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Abstract 

The purpose of this paper is to explore several current 
research directions in the fields of ; digital signal processing 
cind modern control and estimation theory . We examine topics 
such as stability theory, linear prediction, and parameter 
identification, system synthesis and implementation, two- 
dimensional filtering, decentralized control and estimation, 
and image processing, in order to uncover some of the basic 
similarities and differences in the goals, techniques, and 
philosophy of the two disciplines. 
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Introduction 

The writing of this paper was motivated by the belief that the fields 
of digital signal processing and control and estimation theory possess 

I 

enough similarities and differences in philosophy, goals, and analytical 
techniques to merit a detailed joint examination. In order to explore the 
relationship between these two fields, I found it essential to concentrate 

I 

on several specific research directions to provide a focus for my investiga- 

I ■ 

i 

tions. The results of this study were a talk delivered during the 1976 IEEE 
Arden House Workshop on Digital Signal Processing, the present paper, and a 
far more comprehensive manuscript [Zl] . 

Although the paper consists of discussions of several specific reseaorch 
directions, the primary emphasis of this paper is not on results. Rather, 
li have been far more interested in understanding the goals of the research 
eind the methods and approach used by workers in both fields. Understanding 
the goals may help us to see why the techniques used in the two disciplines 
differ. Inspecting the methods and approaches may allow one to see areas 
in which concepts in one field may be usefully applied in the other. In 
summary, the primary goal .of this study is to provide a basis for future 
collaboration among researchers in both fields. 

It is hoped that the above comments will help explain the spirit in 
which this paper has been written. In reading this paper, the reader may 

' i 

find many comments that are either partially or totally unsubstantiated. 


These points have been included in keeping with the speculative nature of 
the study. However, I have attempted to provide background for the specula- 
tion and have limited these comments to questions which I feel represent 



exciting opportunities for interaction and collaboration. Clearly these 
issues must be studied at a far deeper level than is possible in this initial 
effort. To aid others who may wish to follow up on some of the directions 
developed in this paper, an extensive bibliography has been included. In 
addition, the interested reader is referred to [Zl] in which all of these 
research directions are explored in substantially greater breadth and detail. 

Nowhere in the paper have I made a direct attempt to define the fields 
of digital signal processing and control and estimation. Rather,; I hope 
that by examining many of the issues of importance to workers in these 
fields, the reader will be able to piece together a picture of the disci- 
plines and their relationship to each other. As a preface to oxir examina- 
tion, let me mention several points concerning each field. 

* 

In digital signal processing, one of the crucial problems is the design 
of an implementable system meeting certain given design specifications such 
as an ideal frequency response. Here the emphasis often is on the word 
implementable , with a fair amount of attention paid to issues such as the 
structure of the digital filter, its complexity, in terms of architecture 
and computation time, the effect of finite wordlength on ; performance , etc. 
Much of this attention is motivated by the need for extremely efficient 
systems to perform complex signal processing tasks (e.g. , the implementation 
of high order recursive or nonrecursive filters) at very high data rates 
(such as those encountered in speech processing, where one runs into sampling 
rates on the order of 10 kHz). 

In control and estimation, the emphasis has been far less on implemen- 
tation 'and more on developing methods for determining system design specif i- 




I 

editions for estimation or control systems. At one level these specif ica- 

i 

tilons are just a particular class of design guidelines which can then be 

i 

used to construct cui implementable digital system. However, there are 
ma.jor differences between the systems arising in the control context and the 
typical digital processing application. For one thing, the data rates for 
control systems are often far lower (e.g. in aircraft control systems 
sampling rates on the order of .1 kHz are often encountered) . More fvinda- 
mentally, however, the signal processing to be done in a control system 
cannot be judged by itself, as can other signal processing systems, since 

it is part of a feedback loop, and the effect of the processing must be 

I ■ ^ • 

studied in the context of its closed loop effects. 

Also, many modem control and estimation techniques involve the use 

j ■ 

of a state space formulation, as opposed to input-output desicriptions which 

I . 

I ; : 

are usually encountered in digital signal processing applications. Some 

of the reasons for this difference will be made clear in the following sec- 
tions, but one implication is immediately evident. The use of a state-space 
description implies that the system under consideration is causal. In 
standard feedback control problems this is clearly the ease, and thus state- 
space formulations make a great deal of! sense. As we’ll see, there are z 
digital signal processing problems involving noncausal systems or systems 
in which the independent variable has nothing to do with time auid for which 
causality has no intrinsic meaning. Thus, while we will find several places 
in which state space concepts fit in naturally in the digital signal pro- 
cessing context, we'll also find others in which that is decidedly not the 
case. 
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The preceding conments were made in order to provide the reader with 
soiie insight into the perspective I have taken in writing this paper. With 
I this as background, let us begin our examination of research topics in the 

! two fields. 

i • 

!"■ 

\ . * 
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A. Synthesis, Realization, and Implementation 

In this section we investigate one subject area in which some of the 
differences in perspective between the two disciplines are most apparent. 
Sp|ecifically, we consider the question of design. However, our discussion 

i ' 

will not deal very much with design methods but rather with the question of 
trying to pinpoint what researchers in the two disciplines mean by ''design” 
emd what sorts of problems their techniques are equipped to handle . 

Perhaps the most obvious difference between the fields is in the type 
of system representations used. In digital signal processing, the emphasis 
is hjeavily on input/output descriptions, while in control cind estimation the 
enphasis is more on state space models. Thejreasons for this difference 
stem from the different questions addressed by researchers in the two disci- 
plines. In digital signal processing one is interested in the issue of 
implementation of a system with a specified input-output behavior (hence | 
the need for an input-output description) . Questions such as efficient 
implementation and number of bits needed to achieve the desired level of 
accuracy are of great importance. 

On the other hand, in control and estimation theory the issue of 
implementatiori is not considered to neeirly the same extent. Realization 
techniques do address the question of constructing a state space realize- 

’ ' ' ; ; ^ I ' ' ' ■ 

tion that leads to a specified input-output behavior. ; However, as discussed 

! ; i I ' ' : ' : 

in the following svibsections , such techniques do not address many of the 

major issues involved in implementation, and, in fact, state space realiza- 
tions, when viewed as implementable algorithms, don't include some of the 
most important system structures that are used in digital system design. 
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NeyerthelesS/ state space models ^ play an important role in control and esti- 
mation system design. Specifically, a state spaqe model for a given physical 
system, is a necessary ingredient in the application of a number of tech- 

i 

niques for the analysis of system performance cind for the design of feed- 
back control or estimation systems (i.e. the specification of the desired 
input-output behavior of a control or estimation system) . 

Thus, we see some fundamental differences between the perspectives of 
researchers in the two disciplines. There also clearly exist several areas 
for interaction between the fields — to develop useful multi-input/multi- 
outiput structures (a maurriage of digital implementation and multivariable 
realization concepts), to utilize state space techniques to analyze the 

[. ■ .... , ’ i ' 

perfoinnance of digital filter structures, and to consider the digital ample- 

j 

mentation Of state-space control and estimation system designs. 

All of these issues are discussed in detail in this section. 

A.l State Space Realizations and State Space Design Techniques 

The basic realization problem is as follows: we are given a (possibly 
time-varying) description of the input/output behavior of a system 

k 

y0«) = S T(k,i) u(i) (A.1) 

i=0 

where u and y may both be vectors. In the time- invariant case we have that 
the sequence of impulse response matrices satisfies 

T(k,i) = T(k-i,0) = T (A-2) 

iC“l ... 

and in this case we may be given an alternative input/output description 
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in the transform domain 

00 

Y(z) » G(z)U( 2 ) , G(z) = 2 

i»o ^ 

The realization problem consists of finding a state space model 

x(k+l) “ A(k)x(k) + B(k)u(k) 
y(k> = C(k)x(k) + D(k)u(k) 

that yields the desired input/output behavior ((A.l) or (A. 3)) when 
x(0) = 0. 

The realization problem has been studied in detail in the control 
literature, auid one aspect that has received a great deal of attention 
is that of determining minimal realizations — i.e. models as in (A.4) 
with the dimension of x as small as possible. The basic idea here is that 
a minimal realization has no superfluous states that either cannot be 
affected by inputs or do not affect the output. These concepts lead 
directly to the notions of controllability and observability. In the time- 
invariant case, one obtains a rather complete description. Specifically, 
we find that the system (A. 3), has a finite-dimensional realization if and 
only if G(z) is rationeil with each element having no more zeroes th^ poles. 
Furthermore , any controllable and observable time-invariant realization is 
of minimal dimension, and any such minimal realization can be obtained 
from a particular one by change of basis (see, for example, [A-4,5,27,30] ) . 

The algorithm of Ho (A-293 and that of Silverman and Meadows [A-5] 
provide methods for extracting minimal constant realizations from the Hankel 
matrix determined by the T^ (see Subsection B.3 and the references for 


(A. 3) 


(A.4) 
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details of these results) . Thus, if we are given a design specification or 

plant description in terms of a rational G{z) , we can readily determine a 

minimal realization. On the other hand, if we are given 0 in the form (A. 3) 

as opposed to in rational form, partial realization algorithms must be used. 

We will discuss such algorithms in Subsection B.3. 

_ ■ '1 
State space realization algorithms can, in principle, solve certain 

questions related to system synthesis. Specifically, the computation of a 

minimal realization allows us to determine the minimal amoiint of storage 

reqxiired in any implementation, and one of the most important aspects of 

the state-space approach is that it allows one to consider multiple input/ 

multiple output systems and time-varying systems. Since any minimal = state 

I ' 

space realization can be obtained from a given one by change of coordinates, 
clearly realization theory allows soma flexibility in designing good digital 
filter structures. But it is far from the complete answer, as we w.11 see 
in the next subsection. Not only is memory becoming cheaper (thus reducing 
the importance of minimality) > but there are other implementation issues 
besides storage that are of importance, and one also runs into limitations 
in interpreting state space realizations as filter structures. 

A more important aspect of state space realizations comes from the fact 

' ■ , ! ... - ; 

that they play an extremely important part in a number of control and estima- 
tion design probl^s, where one uses state space realizations to model the 
system to be controlled or the signals^ to be filtered. By doing this, one 
can bring into play extremely powerful state space techniques for compensator 
design (A~2, 6], decoupling of the effects of different input channels [A-7] , 
etc., and we refer the reader to the special issue of the IEEE Transactions 


- 10 - 


on Automatic Control [A-32] for an overview of many design methods that have 
been developed. These design algorithms allow one to consider a variety of 
extremely complicated multivariable system problems within a single framework, 
and this ability to handle many variables at once is at the heart of the 
value of state space concepts. 

One important aspect of some of these techniques is that they allow one 
to solve quantitative optimization problems. The linear-quadratic optimal 
control problem is an excimple of this, as is the design of a Wiener filter 
as a steady-state Kalman filter [A-8;, 27] . In this case, we begin by modelling 
the observed signal as the additive white noise-corrupted output of a linear 
state-space model (a shaping filter) driven by white noise. Having solved 
this realization problem, the determination of the optimal Kalman filter is 
reduced to solving a time-varying Riccati equation or a nonlinear algebraic 
Riccati equation for the steady-state (Wiener) filter. Algorithms for solving 
this algebraic equation essentially solve the Wiener spectral factorization 
problem. 

In addition to providing a framework for the specification of designsi, 
the state space framework allows one to analyze the performance characteristics 
of the overall system aftdr it has been dlmpiemented:. For example, the tech- 

: i ' ' ' ' ' ' 

niques described in Section C can be used to study the stability characteristics 
of the system. Another 2 malytical tool used to study system performeince is 

■I : 1 

covariance analysis . Consider the modal 

x(k+l) = Ax(k) + w(k) , y(k) = Cx(k) + v(k) (A. 4) 

where w and v are zero mean, independent white noises, with variances' Q and R, 
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respectively. These noises may represent actual noise sources or the 
effects of small non-linearities, unmodeled phenomena, etc. A simple cal- 
culation yields an equation for the covariances P(k) and S(k) of x(k) and 
y(k) ; 

PJ[k+l) = AP(k)A* + Q, S(k) = CP(k)C + R (A. 5) 

I ! 

If A is a stable matrix, we can evaluate the steady-state covariances P 
and S by solving the Lyapunov equation 

APA'-P = -Q (A.6) 

A. 2 The Implementation of Digital Systems and Filters 

As discussed in [A-ll , the design of digital systems consists of | 

. I . ' ' _ . ' ; ' i 

several parts, including the specification of the desired input/output 
relationship and the implementation, using fjinite precision arithmetic, 
of a system that approximates this desired behavior. From this point of 
view, the methods of the preceding section deal with the first issue. 
Realization- procedures play an indirect role in these techniques in pro- 
viding the state space models on which the design methods are based. But 
wha,t about realizations from the point i of ; view of system synthesis and 
implementation?: As we shall see, state space realizations can play some 
role, but they are far from providing the entire solution. 

A wide variety of digital filter design methods have been developed to 
deal with the second issue. One factor that does enter into this design 
question is the number of storage elements (delays) in the filter structure, 
and thus the issue of minimality is of some importance. Of course, in 


/ 
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deaiing with single-input, single-output trsmsfer functions, one can read 
off the order of a canonic structure and can construct several quite 
easily by simple inspection of the specified transfer function. The deter- 

i' . I : ' ' ■ ■ ^ ; 

xnination of the order of a caiionic realization and the ability to construct 


several minimal realizations without much difficulty barely scratches the 


surface of the structures problem, however. As pointed out in CA-1] , the 
various filter structures available may be equivalent from an input-output 
viewpoint if one didn't have to worry about computation time, the complexity 
of the digital architecture or algorithm required to implement a given 


structure, the effect of finite precision in representing filter coeffi- 
cients, or the effects of overflow and quantization. These are the issues 
that motivate much of the study of various filter structures [A-1, 10,11]. 

Let us examine some of these issues in the context of a particularly 
-important structure, the cascade form , obtained by factoring a transfer- 
function H(z) as a product of lower-order transfer functions. Consider the 
example . 


, z^+(b+d)z+bd d+bz"^) (1-5-dz"^) 

H (z) = = {A. 7) 

z -(a-Kc)z+ac (l^az ) (1-cz ) 

In Figure A.l we have realized this filter as the cascade of two first 
order filters. Note that the overall filter is minimal. 

In Section C we consider the effects on digital filter performance of 
quantization and overflow on system stability. An alternative, approximate 
method for evaluating the effect of finite word length on system performance 
is to model each quantization as if it introduced noise into the system 







[A-1] . By assuming independence of these various sources — a rather strong 
amd sometimes unjustified ass\mption (as the existence of period effects, 
i.e. limit cycles, indicates) — one can in principle evaluate the overall 
noise power at the output, and thus can obtain a measure of the size of 
quantization effects. As an example, consider the case [A-11 of fixed-point 
arithmetic euid roundoff quantization in which the quantization interval q 
is 2 In this case, the quantization error e introduced by a single mul- 
tiplication takes on a value between ±.5q. If one makes the assumption 
that e is uniformly distributed, we find that it has zero mean and variance 
q /12. Then, for example, in the cascade filter of Figure A.l, one could 
add one such noise source following each of the four multiplications. 

Another extremely important issue in filter design is the sensitivity 
of filter performance to variation in coefficients. This is quite central 
an issue, since one can only represent coefficients up to a finite degree 
of accuracy,; and hence one cannot obtain filters with arbitrary pole and 
zero locations. The allowable poles and zeroes and the sensitivity to 
variations in parameters depend quite significantly on the particular struc- 
ture under consideration. For example, parallel and cascade structres are 
often used because the perturbations in the poles are isolated from one 
another. 

For the remainder of this section, we wish to examine the relationship 
of state space techniques and concepts to some of the questions in digital 
filter design. Let us first examine the use of state space techniques to 
determine filter structures . Consider the transfer function (A. 7) . In this 
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case, state space techniques yield a variety of minijnal realizations of the 


form 


x(k+l) = 


*11 *12 


^21 ^22 


x(k) + 


(A. 8) 


y(k) = [h^, h^] x(k) + u(k) 

I 

If we interpret (A. 8) as am algorithm, we must compute the various products , 

, . . . .1 

f..x.(k), g.u(k), h.x. (k) (i, j=l,2), and perform the appropriate additions. ^ 

Note that in general, there are 8 multiplications amd 6 additions required, 
j Now consider the cascade structure of Figure A.l. Interpreting it as 

ain algorithm (a and b multiply x^(k), c and d multiply x^Ck), and we perform 

I X 

the required additions) , we see that we require 4 multiplicatxons and 4 addi- 
tions, but this is not the most important difference between the two algorithms, 
since it is possible to obtain realizations (A. 8} with some zero elements in 
(F,g,h). However, the crucial difference is the following: if one interprets 
a state space realization as determining an algorithm of the type indicated. 


, ^ 

then there is no way that the cascade algorithm is of this type! This is not 


to say that one cannot find a state-space description of the cascade realiza- 


tion. In fact 


: a 0 1 

x(k+l) = x(k) + u(k) 

(a+b) c 1 


(A. 9) 


y (k) = [ (a+b) , (c+d)Ix(k) + u(k) 
is such a realization. 

The point made above may, at first glance, seem to be' trivial, but it is 


not, since it points out that although any (infinite precision) algorithm can 
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be I described dyneiniically in state space terms, direct interpretation of a 
sta-te space description as an algorithm does not allow one to consider all 
possible algorithms. That is, it is relatively easy to go from an algorithm 

I ' ■' 

to| a state-space description, but it is not at all natural or clear how to 
go the other way, and hindsight is needed in order to interpret a realiza- 
tion of the form of (A. 9) as a cascade structure . 

^ Thus, we see that state space models have limitations when one considers 
the issue of implementation. There are, however, several areas where inter- 
action between the two fields may be of use. First of all, the techniques 
used in digital signal processing should be of. use in considering the iraple- 
mentation of control and estimation system designs such as those mentxoned 
in Subsection A.l. Also, recall that state space realization techniques 

i 

allow one to determine minimal realizations for systems with multiple inputs 
and outputs. It is possible that this fact , combined with a thorough under- 
standing of the relationship between state-space realizations and various 
digital system structures will lead to the development of useful filter 
structures for multivariable systems. 

] ■ ' Also, as mentioned in the preceding subsection, the state space frame- 
work is particularly useful for the analysis of the properties of (dynamical 
systems. Thus, it seems natural to ask if these techniques might be useful 
in the analysis of various filter structures. In Section C we discuss this 
question with respect to stability analysis techniques. Also, it is possible 
that state-space sensitivity techniques [A-9] could be useful in the study 
of the sensitivity of various digital filter structures; but this awaits 
f\arther study. 
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Finally, let us examine the utility of state-space techniques in the 
analysis of the effect of quantization noise on filter performance. We do 
this by example, although it should be clear that this approach extends to 
arbitrary structures. Consider the cascade structure in Figure A.l where 

■I . . . ■ 

we add quantization noise after each multiplication. A state space repre- 
sentation of ' this system can be writte^v down by inspection: 


x(k+l) = Fx(k) + gu(k) + rN(k)' 


(AviO) 


y(k) = h'x(k) + u(k) + 4’N(k) 

where F, g, and h are given in (A. 9), N(k) is the 4-dimensional noise vector 
whose components are the noises contributed by the multiplications by a, b, 


c, and d, respectively. Then 'V - (1,1, 1,1), and 



1 0 
1 1 


0 0 
1 0 


(A, 11) 


If we make the usual independence assumptions concerning the components and 
time-behavior of N, we can directly apply the covariance analysis equations 
(A. 5), (A. 6) to determine the effect of quantization noise on x and y. Mote 
that (A. 5), (A. 6) yield the effect of noise throughout the network. The 
utility of an approach such as this for digital network analysis needs to be 
examined more carefully, but it appears that it may be computationally 
superior to other methods , such as those that use signal flow graph tech- 
niques [A-12] or that require computing a number of partial transfer functions 
[A-3]. We note, that Parker and Girard [A-15] used Lyapunov-type equations ! 
and analysis quite similar . to our development for the evaluation of output 
noise power due to correlated quantization errors. In addition, similar 
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analyses have been undertaken by Hwang [A-17] , Mullis and Roberts [A-181, 
amd Sripad and Snyder [A-19,20]. Hwang uses Lyapunov-state space equations 
to study the effects of possible structure transformations and state-amplitudjs 
spalings. Mullis and Roberts have obtained some significant results for 
digital filter design using a framework similar to Hwang's to study what 
they call "minimal noise realizations (see [A-31] for further developments) . 
Sripad and Snyder develop conditions under which quantization errors are in 
fact white, and they also use Lyapunov-type analysis to compare the perfor- 
mance of two different realizations. Within this framework, one can pose a 
number of other questions. For example, in the case of floating point 
arithmetic, the quantization error depends on the size of the signal. Can 
state-space procedures for analyzing "state-dependent noise" [A-161 be of 
value here? Questions such as these await future investigation,. 

In this section we have seen soma of the issues involved in system 
design in the two fields. The issue of implementation is at the very heart 
of the problems considered by researchers in digital signal processing, 
while researchers in control and estimation have concentrated more on the 
development of general design procedures for State space models and methods 
for analyzing the characteristics of such models. We have seen that there 
are points of overlap and places in which techniques and concepts from one 
discipline may be of value in the other. State space techniques may be 

I . • ■ . 

useful in the analysis of multivariable structures and in the analysis of 
sensitivity and quantization noise behavior of different structures. Such 
issues remain to be studied, but it is in the other direction that there is 
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the most to be done. The issues involved in the digital implementation of 
systems specified by state space design methods remain leurgely unexplored. 
Numerous problems abound, if^at is the effect of roundoff noise on closed- 
loop controller performance, and how many bits must we use to achieve the 
disired regulation properties [A-21,22,24,25,23]? It is well known that 
"optimal” controllers emd estimators require many arithmetic operations and 
hence lead to low sampling rates. Can we improve overall performance by 
using a simpler "suboptimal" system at a higher sampling rate [A-13] ? If ve 
are controlling a con^Jlex system, "optimal" controllers require not only a 
great deal of computation, but also the centralized processing of all infor- 
mation, and the cost of relaying information to a central location may be 
prohibitive. Can we devise decentralized control architectures that take 
advantage both of the structure of the dynamics of the system being controlled 
and the capabilities of the available types of digital processors? Here 
again, if we include the cost of information transfer, "suboptimal" decen- 
tralized systems may outperform the "optimal" system (see [A-14, 23, 26] for 
some results and problems concerned with parallel and distributed processing 
and decentralized control). 

The study of problems such as these — i.e., the interaction of imple- 
mentation and architecture issues and the design of control and estimation ' 
systems — is still in its infancy, and it appears to offer an extremely 
promising avenue for research. We note that architectural issues have 
received a great deal of attention in the field of digital signal processing 
[A-10, 12] , and this, together with the wealth of literature on digital filter 



! 
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structures, indicates that there is much to be gained from future interaction 
emd collaboration. 






- 21 - 


B. Parameter Identification, Linear Prediction, Least Squares, and 
Kalman Filtering 

A problem of great importance in many disciplines is the determination 
of the parameters of a model given observations of the physical process 
being modeled. In control theory this problem is often caLlled the system 
identification problem, and it arises in mamy contexts. The reader is 
referred to the special issue of the IEEE Transactions on Automatic Control 


[B-lOi and to the survey paper of Astrom and Eykhoff [B-11] for detailed 
discussions and numerous references. 

Parameter identification problems also arise in several digital signal 

I 

processing; applications. Several examples of such problems are given in 
the special issue of the Proceedings of the IEEE [B— 331, and one of these, 
the analysis, coding,- and synthesis of speech, has received a great deal of 
atl-ention in the past few years [B-15, 21-23]. We will use this problem as 

I . ' 

the basis for our discussion of the identification question. Our presenta- 
tion is necessarily brief and intuitive, and the reader is referred to these 
references for details. As discussed in [B-21] a popular approach is to 
model a discretized speech signal {y(k)} as the output of a linear system. 


which 


, over short enough intervals of time, can be represented by a time- 


invariant transfer function G(z) . Here the input is taken as a periodic 


pulse train (whose period is the pitch period) for voiced so\inds (such as 
vowels) and as white noise for unvoiced sounds (such as the sound ”sh”) . 

In addition, a common assumption is that G is an all-pole filter, which 
leads to an autoregressive (AR) model 

y(k) + a y(k-l) + ... + a y(k-p) - u(k) (B.l) 

X . . P 
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This assumption has been justified in the literature under many conditions, 
although strong nasal sounds require zeros [B-21] . 

The problem now is to determine the coefficients a , ...,a . Having 
these coefficients, one is in a position to solve a number of speech analysis 
and communication problems. For example, one can use the model (B.l) to 
estimate formant frequencies aind bcmdwidths, where the formants are the 
resonances 6f the vocal tract [B-24] . In addition, one can use the model 
(B.l) for efficient coding, transmission, and synthesis of speech [B-29] . 

The basic idea herS is the following: as the modalL (B.l) indicates, the 
speech signal y(k) contains highly redundant information, and a straight- 
forward transmission of the signal will require high channel capacity for 
accurate reconstzruction of speech. On the other hand, one Can interpret 
(B.l) as specifying a one-step predictor for y(k) in terms of preceding 
values of y (assuming u(k) = 0). As discussed in [3-29], one often requires 
far fewer bits to code the prediction error u than the‘ original signal y. 
Thus, one arrives at an efficient transmission scheme (linear predictive 
coding--LPC): given y, estimate the a^, compute u, transmit the and u. 

At the receiver, we then can use (B.l) to reconstruct y. An alternative 
interpretation of this procedure is the: following': given y, estimate G, 
pass y through the inverse, all zero: (moving average ~ MA) filterl/G(z),' 
transmit the coefficients in G and the. output of the inverse filter. At 
the receiver, we then; pass the received signal through G to recover y 
(-thus this procedure is causal and causally invertible). 

The question remains as to how one estimates the a^. The most widely 
used technique in the literature is linear prediction . Using the inter- 
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pretation of (B.l) , as specifying a one«step predictor for the signal y, 
we wish to choose the coefficients aj^,...,ap to nininiize the sum of squares 
of the prediction errors e(n) = y(n)-y(n), nel. Here we assume that we are , 

given y(0), ,y(N-l), while the set I can be chosen in different manners, 

and we will see in the following subsections that different choices can 
lead to different results and to different interpretations. 

Before beginning these inve.S'fcigations , let us carry out the minimization 
required in linear prediction. Taking the first derivative with respect to 
the a^ of the sum of squared errors, and setting this equal to zero, we 
obteU.n the normal equations 


P 

1*1 


k=l, . . . ,p 


(B.2) 


C.J. * 5] y(n-i)y(n-k) 
n£I 


(B.3) 


These equations are typical of the types of equations that arise in linear, 
least-squares problems, and their efficient solution has been the topic of 
many research efforts. 


B.l The : Autocorrelation Method , Kalman Filtering for Stationary Process , 
and Fast Algorithms 

Suppose we let T = all integers!, where we define y(n) = 0 for n < 0, 

I • , . I 

n > N. In this case, we find that 

N^l-Ii-j| . 

c. . = 2 y(h)y (n+li-j] ) = r(ji-jj) (B.4) 

n=0 

and the normal equations become Ta = d, where a' = (a^, . . . va^) , 
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d' = (-r(l), -r (2) , . . . ,-r{p) ) , and T| is a symmetric Toeplitz matrix 
lB-37,84,91] (i.e. the ij^^ element depends only on [i-jl> with T. . = c. .. 

We also note [B-15] that if y is a stationary random process with autocorre- 
lation 

R(i) = E[y(n)y(n+i) 1 (B-5) 

and if we want to find the one step prediction coefficients to minimize 
2 

E[e (n) 1 , we obtain an identical set of (Toeplitz) equations with r(i) 
replaced by R(i) . This statistical point of view is extremely useful for 
obtaining certain insights into the approach. 

The solution of such Toeplitz equations has been the subject of a great 
•deal of attention in the mathematical^ statistical, and engineering litera- 
ture [B-3.,4,15,17,18] . An efficient algorithm was proposed by Levinson 
[B-171 , improved upon by Durbin [B- 32] , and studied in the speech processing 
context by several authors, including ItaJcura and Saito [3-23]. The method 

• .. 1 i I : 

essentially consists of solving forward and backward prediction problems 
of increasing size in a recursive manner. That is, the algorithm computes 
the coefficients a(lji,. . . ,a(i|li) for the best prediction of y(n) based on 
y (n-l) , . . . ,y (n-i) and the coefficients b (1 | i, . . . ,b (i | i) for the best pre-' 
diction of yCn-i-1) based on y (n-i) , . . . ,y (n-1) . The algorithm iterates 
on i. As a part of this algorithm, one computes the prediction error (for i 

both forward, and backward prediction) , and thus one can determine when to 

I ’ i- ' 

stop] based on the size of this quantity. Also, we must compute a coeffi- 
cient k^, which is known as the partial correlation coefficient (see 
IB-15,21,23]) . 
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Let us now examine what this algorithm means 
of view. The algorithm specifies estimators of n 

i 

y(i) = - 5; a(jii)y(i-j) 

j*l 

i 

i 

y( 0 ) 

3=1 

Thus , we can think of the algorithm as providing 

coefficients of the weighting pattern of the opni 

I 

of the optimal initial time smoother * Note that 
general, time varying (i.e., a(j|i) ^ a(j)), sine 
tion is time-varying when one bases the predictie 
data. 

What does this mean as far as all-pole model 
goes? The answer to that is not much- In the a! 
we We equivalently only interested in designing 
prediction filter that produces the best estimats 
window” y(n-l) ,...,y(n-p) . The cpefficients of = 
adlp) , . . . ,a(p|p) , and it doesn't matter (excepn 
of view) that these coefficients were generated = 
filter weighting pattern. 

C5n the other hand, the time-varying weigh ti: 
is extremely important from a statistical point- ;: 
wishes to design recursive predictors that are z\ 
all past measurements and not just a data window 


from a statistical point 
he form 

(B.6) 

(B.7) 

us with the time -varying 
mal one-step predictor and 
these coefficients are , in- ■ 
;e the mechanism of predic- 
n on only a finite set of 


.ing via linear prediction 
,1-pole modeling problem, 
a FIR filter --i.e. a; 

! of y(n) given the "data 
luch a filter are precisely 
from a computational point 
iS part of a time-varying 

•.g pattern interpretation 
if view, especially if one 
itabla of incorporating 
in the case when y has a 
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Markovian representation 

, x(k+l) = Ax(k) + w(k), y(k) = c'x(k) 


(B.8) 


where x is a random n-vector, A is a constant ,nxn matrix, c is a constant 
n-vector, and w is a zero-meam uncorrelated sequence ^ith covariance Q. 

The correlation coefficients of y can be computed by direct examination of 
(B.8). We note that x and y will be stationary with 


R(i) = c'A fie i > 0 


(B.9) 


if A is stable and if II, the covariance of x, satisfies the Lyapunov equation 


AiiA' - n = -Q 


(B.IO) 


We now wish to design an optimal predictor for recursively estimating y(n). 
This is a Wandard estimation problem, and the solution is the Kalman filter 
[B-4] : 


x(n) = Ax(n-l) + AK(n-l}y:(nr:l) , y(n) - c'x(n) 
Y(n-l) = y(n-l> - y(n-l) 


(B.ll) 


where the time-varying gain satisfies 


. P(ntn-l)c 
K(n) = -Tc ' 


(B.12) 


c'P(njn-l)c 

Here P(n}n-1) is the covariance of the prediction error x(n) - x(n), 

, , I , I 1 X ^ « AP(n|n-l)cc'?(n|n-l) A* rn nv 

PCn+ljn) = AP(nln-l)A' + Q c'P(n n-l)c ~~ , (B.13) 

Let us make a few comments about these equations. Note that the filter 
innovations y(n) is precisely the prediction error e(n), and its variance 
is c'P(n[n-l)c. Also, recall that in the all -pole framework, we could alter- 



natively view the prediction filter as speeilyinc an inverse filter, which 

( : r ; ' ' 

took the y's as inputs and produced the uncorrelated sequence of prediction 
errors as the output. In the context of the Kalrian filter, the auialogous 
filter is the innovations representation (see representation lR-1 of [B-27]), 
in which we view the output of (B.ll) as being Y(n). Finally, note that 
one can compute the predictor coefficients a(j;i) as the weighting pattern 
of the filter (B.ll). 

Examining (B. 11)-(B.13) , we see that the computation of the recursive 
filter coefficients requires the solution of the (discrete time) Riccati 
equation (B.13). If x is an n-vector, then (using the fact that P is 
symmetric), (B.13) represents n(n+l)/2 equations. For reasonably large 
values of n^ this can be an ejctreme computational load, especially given 
that all that is heeded for the filter is the nxm gain matrix K (when y is 
m-dimensional) . Thus when m « n, the question of computing K without P 
arises quite naturally, and this issue — in both continuous and discrete 
time, in stationary and in some nonstationary cases — has been the subject 
of rtimerous papers in the recent past [B-l-S] . : “he underlying concepts that 
have led to these "fast algorithms" (at least' ijn the stationary case) , are 
the same as those that lead to the Levinson algorithm. For some historical 
and mathematical perspective on this subject, , we refer the reader to [B-3,41. 
In particular, the extension of the Levinson aigorithm to the multivariable 
case is discussed in these papers (see also reference [B-13] ) . In this 
case, the matrix T in the normal equation^ i^ olock-Toeplitz, and the exten- 
sion to this case is decidedly nontrivial. 

There are a number of deep mathematical and pnysical insights that can 
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be obtained by the examination of these fast aigotithms. As discussed in 
[B-15,21], the Levinson algorithm involves an au:<iiiary partial correlation 
coefficient k^, which has an interpretation as a rsf lection coefficient, 
and this fact has been utilized in speech processing, in which these coeffi- 


cients specify certain parameters in an acoustic 
[B-15,21]. In addition Casti and Tse [B-20] , Ka; 
Casti [B-8] have shown that the fast Kalman gain 
related to the work of certain astrophysicists, i 
[B-19] , who devised algorithms for solving finit? 
arising in radiative transfer. Also, relationshi 
auid scattering theory have been brought to lighc 
And finally, for a good overview of some of the - 
including some with the theory of orthogonal polj 
reader to [B-4,42]. These ideas are of interest 
algorithms from several perspectives allows us t 
properties, potentials, and limitations. 


mc-del of the speech process 
lath [B-1,4] and Sidhu and 
algorithms are closely 
n particular Chandrasekhar 
time Wiener-Hopf equations 
ps between linear filtering 
in the recent papers [B-34,35] 
athematical relationships , 
jT.cmials, we refer the 
in that seeing these 
gain insight into their 


B.2 The Covariance Method, Recursive Least Squares Identification, and 

Kalman Filters 

Consider again the normal equations (B.2), (5.3). We now consider the 
range of n to be only as large as the actual data allows — i.e., in equa- 
tion (B.l) we will require that k, k-l,...,k-p all be within the range 
0,. .. ,N-1. This leads tO the restriction p ^ r. < ::-l. Also, in this case 
c^j is not in general a function of i-j, the manrix T is symmetric but not 
Toeplitz, and the fast methods of the preceding s'ubsection don’t carry over 
quite so nicely. Recently , "however , Morf, et al. [3-30] have obtained fast 
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algorithms for the covariance method by exploiting the fact that, although 
T is not Toeplitz, it is the product of Toeplitz matrices. 

Let us take a look at the covariance method from a slightly different 

•i ! 

point of view. Recall that the algorithm mentioned above and the one in the 

i 

preceding subsection involve recursions on the order of the filter given a 
fixed set of data. Suppose now we consider a rertrsion for updating- coeffi- 
cients Of a fixed order filter given more and more data. To do this, we 
refer to [B-11 ] , where the covariance method is discussed. Given the data 
y (0) , . . . ,y (N-1) , the covariance method attempos no fit a model of the form 
of (B.l) by! finding a least squares fit a(N) to ohe equation 


^-1^ “ ^N-1 (B.14) 

where a' = (a,, — , a ), f'„ , = (y (p) , . . . , vC:-!) ; , and L has various y(i). 
i 1 p N-i 

as its elements. Suppose we have a (N-1) and we now obtain the new data 
point y(N) . We would like to update our estimare in a manner more ; efficient 
than re-solving (B.14) from scratch. Following standard recursive least 
squares (RLS) procedures [B-11] , we find that (here (N) is the last row 
of L^): 

a(N) = a(N-l) + K(N) [y(N)-i'(N)a(N-i}: s i(N-l) + K(N)r(N) 

(B.15) 

P(N-1)S.(N) 


K(N) = 


1+ii’ (N)P(N-l))l (N) 


PW, . = 

N N 


l+V {N)?(::-l)i(N) 


(B.16) 

(B.17) 


Examining these equations, wa see that they can be interpreted as 
defining a Kalman filter (see [B-12] ) . In fact, referring to [B-141 , we 
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see that these are precisely the Kalman filter equations used by Melsa, et al. 
in speech processing. Specifically, they consider the dynamic equations 

a(k+l) = a(k), y(k) = i'(k)a(k) + v(k) (B.18) 

j 

where v(k) is a zero-mean, white process with variance If 'F is set to 1, 
we obtain (B.15)-(B.17) . Also, in this formulation, P(N) has the interpre- 
tation as the covariance of the estimation error a-a (N) . 

Let us note some of the properties of the recursive solution (B.15)- 

I ■ " ^ i ..) I 

(B.17). Examining (B.15), we see that the increment in our estimate a is i 
proportional to the error (innovations) in predicting thei latest value of y 
using preceding values and our previous estimate of a. This suggests that 
a monitoring of the innovations r(N) can be used to help detect abrupt 
changes in the predictor coefficients or the presence of glottal excitatioh 

in voiced soiands. In this manner one may be able to improve upon the esti- 

! , 

mat ion of a. Whether such a procedure would be of value is a matter for 
future j study. Also, it is possible to make the filter more responsive to 
changes in the coefficients by using one of several methods available for 
adjusting Kalman filter [B-41] . These include exponentially age-weighting 
old data in favor of the more recent pieces of information or the modeling 
of a as a slowly- varying Markov process. In addition, the formulation (B.18) 
provides a method for developing an analysis system for noise-degraded speech 
(i.e.,^the case when ¥ > i) . 

Let us now consider the computational! complexity of (3.15) - (B.17) . First 
note that one does not have to compute the correlation coefficients. However, 
one does have to calcualte K(N) at every stage, and if one solves for the gain 
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from the Riccati equation (B.62) , one has on the order of p multiplications 

per stage. However, Morf, et al. [B-30] and Morf and Ljung [B-40] have 

! 

exploited the structure of the equations to obtain fast algorithms for the 
direction computation of K. Combined with the fast algorithms mentioned 
earlier, one now has efficient recursive procedures for the covariance method 
as one increases either the order p of the predictor or the number N of data 
points . 

B.3 Design of a Predictor as a Stochastic Realization Problem 

A problem that has attracted a great deal of attention in the I control 
and estimation literature is the stochastic realization problem [B-4,8-10, 
13,27]. Briefly stated, the stochastic! realization problem asks the 
following: given a stationary Gaussian random process y with correlation 
function R(n) , find a Markovian representation 

x(n+l) = Ax(n) + w(n) , y(n) = c'x(n) (B.19) 

I.. ■ , 

where w is a , zero mean white noise process with covariance Q. Referring 

r ■ ■■ 

to (B.8)-(B.10) , we see that this is equivalent to finding a f actorization r 

R(i) = c'A^b i > 0 (B.20) 

b =.Pc, APA'-P = -Q (B.21) 

Examining (B.20), (b. 21), we see that the algorithm falls naturally into 
two pieces: (1) find a triple (A,b,c) satisfying (B.20); (2) find P and Q 
satisfying (B.21). One of the best-known studies of this problem is that of 
Faurre [B-13,25]. As he pointed out, the first step of the algorithm is 
simply the well-known deterministic realization problem when one is given 
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the "weighting pattern" R(0) , R(l) , R(2) . This problem has been widely 

studied in the literature [A-30, 31, B-6,9] and we will make a few comments 
about this aspect of the problem in a few moments. Before discussing the 
mamerical aspects of the first step or the details of the second, let us 
see what the first part yields in the frequency domain [B-261 . Let 

+«o 

S (z) = 5 R(i)z‘^ (B.22) 

y . ^ , 

l=-oo ' 

Then, we see that the factorization (B.20) yields 

j - Sy(z) = c' (zI-A)"^zb + c' (z'^I-A)"^Ab (B.23) 

I 

I 

Noting the form of (B.23), and defining a(z) = det(zI-A) we see that the 
first step in the algorithm yields 

3 (Z) = , • - , :.(B.-24) 

^ a(z)a(z ■^) 

That is, we have obtained a factorization of the denominator of S^. If we 
can also factor the numerator we will have determined the desired trams fer 
fvinction 8(z)/a(z), which, when driven by whits noise, yields the spectrum 
Sy(z). It is clear from (B.19) that it is this second part of the spectral 
factorization that is accomplished by the second step of the stochastic 
realization algorithm. Finally, note that the model obtained contains both 
poles and zeroes. 

There are several methods for performing the second step of the 
algorithm. Faurre [B-13] showed that (B.21) could be solved for values 
of P inside a given range, and he identified the smallest such covariance, 

P^, as that arising from an innovations representation of y — ■ i.e., a steady- 
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state Kalman filter. Thus to complete the second step v;e can either solve 
an algebraic Riccati equation or can use the "fast algorithms", as described 
in Subsection B.l to compute the time-vaurying Kalman gain. Letting the 
transients die out, we then obtain the desired steady-state filter. Although 
this approach involves solving for the entire gain time history, this proce- 
dure may be faster than direct solution of the algebraic Riccati equation. 

Let us now turn to the nxomerical aspects of the first stage — i.e. the 
computation of the factorization (B.20). The algorithms of Rissanen [B-8] 
and Ho tA-29] are based on the examination of the Hankel matrix 

R(0) R(l) R(2) R(N-l) 

R(l) "R(2) R(3) R(N) 


R(N-l) RCN) R(N+1) R(2N-2 

It is well-known [B-36] (see also Subsection A.l) that R admits a factoriza- 
tion (B.20) if and only if there is some integer n such that 

rahk < n for all N (B. 26) 

N — , - ^ 

Ho's original algorithm yielded a minimal realization (i.e. dim A in (B.20) 
is as small as possible) if a bound n was known, in advance. A far more 
critical question (from a practical point of view) is the partial realization 
question . Here we take into account that we only have available a finite 
number of correlations R(0) , R(l) , . . . , R(N-l) , and one would like to obtain 
the minimal factorization that matches these. One can use Ho’s algorithm 
for this, but it is not recursive — i.e. if we incorporate R(N) , we must 
re-solve the whole problem. Fortunately, Rissanen [B-8] and Dickinson, et al. 
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[B-61 have developed efficient, recursive procedures (the latter of which 
is based on the Berlekamp-Massey algorithm [B-7] , which was developed for 

I 

the scalar case) . We note that these algorithms essentially solve the Fade 
approximation problem, and we refer the reader to the references for details. 

Thus, efficient algorithms exist for spectral factorization and one 
wCuld expect good results if the process y truly has a Markovian representa- 
tion and if one has the exact values of the correlations . This points out 
a conceptual difference between linear prediction and the above stochastic 
realization procedure. In linear prediction, no pretense is made about 
exactly matching a model. All that is wanted is a least-squares .fit, and 
thus one would expect this procedure to be relatively robust when one uses 
a finite record of real data to generate an estimate of the correlation 
function which is then used in the linear prediction procedure. On the 
other hand, it can easily be seen that an infinitesimal perturbation of 
in (B.25) canimake it have full rank. In this case^ the partial realization 
procedures — which in essence are looking to match a model exactly will 
yield a system of extremely high dimension. Thus, it appears that these 
algorithms 'are inherently sensitive to errors in estimates of the correlation 
coefficients. In addition, if y has no Markovian representation, the linear 
prediction approach will still work fine, but the partial realization pro- 
cedure may very well run astray as it tries to it the data "too closely". 

Does this mean that the above procedure is of no use in identifying 
parameters in a speech model? The answer to that is perhaps not. What is 
needed is a modification of the first step of the stochastic realization 
algorithm. As the version described here stands, in is too sensitive and 




in fact, DeJong [B-37] has shown that these methods are numerically unstable 
in that the inexact minimal realization supplied by these algorithms, as 
implemented on a finite wordlength computer, may not be a "numerical neigh- 
bor" of the sequence {R(i)} that is to be factored. By rephrasing the 
algorithm in terms of e-rank — the least rank of all systems within an 
” e-neighborhood” of the given sequence — DeJong obtains a slower algorithm 

i 

that is similar to Rissanen's but is numerically stable. This approach is 
extremely appealing for two reasons: (1) We can, within this framework, seek 
minimal realizations in the £ -neighborhood of a sequence '{R(i) } that itself 
is not realizable by a finite dimensional system; (2) We can seek the "nearest 
reduced-order realization of given dimension of a given system. 

In addition to the work of DeJong, a number of other methods have been 
proposed for "approximate" Fade approximations [B-31, 38, 39]. One interesting 
possibility is the all-pole approximations — i. a. , we perform linear predic- 
tion on the R(i] . This would require computing the correlation of the R(i) ! 

^ i ,,, , ' ■■ ■ 

. 1 •_ _ ■ - ■ ■ ■ ■ , ' ' . 

(Note that an all-pole “assumption here would not necessarily lead to an all- 
pole model in (B.19)) . 

One of our goals in this section has been to point out a number of simi- 
larities between the goals and techniques of the two disciplines. We have 
also seen some of the differences, but others have not been discussed. In 
particular,' in this section we have treated identification for identifica- 
tion's sake. As pointed out in [B-ll] .in control system design, identifica- 
tion is often simply a means toward the goal of efficient control. Thus, in 
many control applications, the value of identification is not measured by 
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the accuracy of the parameter estimates, bu1: rather by the performance of 
the overall system. In addition, in control there are several types of 
identification problems, since one has the opportunity to excite the system 
through inputs. Different problems arise if the system is operating open 
loop, in a time -invariant closed-loop mode-, or in an adaptive closed loop ^ 
mode. We refer the reader to [B-10,12] for more on this subject and for 
further references. In addition, in many on-line control problems the 
number of parameters to be identified is not very large — four or five. 

In fact, one of the key problems in practical adaptive control is the care- 
ful choosing of which few pareimeters to identify. 

On the digital filtering side, one is often interested in the accuracy 
cjf the parameter estimates. This is of importance, fdr example, if one is 
attempting to design an all-pole filter that matches a given impulse response 
in a least squares sense, or if one is attempting to estimate formants from 
a!n all -pole speech model. On the other hand, for linear predictive coding^ 
the accuracy, of the parameters may be Of secondary interest , while the 
primary concern is more efficient coding of speech data. In this case, : i 

accuracy is of importance only in so far as it makes the coding scheme more 
efficient. Also, in the speech problem, we are usually dealing with many 
unknown parameters — between twelve and sixteen [3-21]. 

• With regard to the speech problem, we note that linear, prediction has 
proven to be a particularly appropriate tool for a variety of reasons, ranging 

I ; ; . . . i 

from the fact that the all-pole model is often a realistic one to the property 
that the linear prediction procedure tends to match the spectral envelope of 
the data [B-15] . In this section we have described a number of related 


identification concepts (see [Z-1] for more) , some of which may be useful 
in solving problems in speech analysis, such as enhancement of noise-degraded 
speech. We have also pointed out a number of questions concerning some of 

these methods, such as the need for detailed numerical analyses of the many 

i 

"fast" algorithms, and the necessity of further analysis and experimentation 
to assess whether any of these techniques can improve upon the performance 
achievable in speech processing using linear prediction. 
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C. Stability Analysis 

I In the field of digital signal processing, stability issues arise 

I ^ ^ 

when one considers the consequences of finite v?ord length in digital' 
filters. On the one hand, a digital filter necessarily has finite range, 
and thus overflows can occur, while on the other, one is inevitably faced 
with the problem of numerical quantization — roundoff or truncation. 

Sines the filter has finite range, the questioh of the state of the filter 
growing without bound is irrelevant. However, the nonlinearities in the 
filter, introduced by whatever form of finite arithemtic is used, can 
cause zero-input limit cycles and can also lead to discrepancies between 
■|he ideal and actual response of the filter to certain inputs. Following 
the discussions in lC-3, 9], the typi^)ar 'situation can be described as 
follows „ 


x(n+l) = AxCn) + BuCn) , yCn) = Cx(n) 


(C.l) 


XCn) = NCx(n> ) 


wh^re N is a nonlinear, memoryless function that accounts for the effects 
of 'overflow and quantization. If one assumes that the associated linear 
system (i.e. , N= identity) is designed to meet certain specifications, 
one would like to know how the nonlinearity N affects overall performance. 
For example, filter designers are interested in determining bounds on the 
magnitudes of“ limit cycles and in finding out how many bits one needs to 
keep the magnitudes of such oscillations within tolerable limits. 

On the other side, a typical feedback control system is described by 


y = G^(e) , 


e - u-G^Cy) 


(C.2) 
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where the input u, the output y, and the error s are functions of time, 
and emd represent the dynamics of the forward and feedback paths, 
respectively. In control theory one is interested either in the 
analysis or the synthesis of such systems ^ In the synthesis problem we 


are given an open loop system G^^ cuid are asked to design a feedback 
system such that the overall system has certain desirable st^ility 
properties. In the case of stability analysis, one may be interested 
either in the driven or the undriven characteristics. In the driven 
case the problem involves determining if bounded inputs lead to bounded 
outputs and if small changes in u lead to small changes inthb y. In 
the undriven case, we are interested in seeing if the system response 
decays, remains bounded, or diverges when the only perturbing influences 


are initial conditions. 


it is clear that the problems of interest to researt'.ers in both dis- 


ciplines have a good, deal in common, and, as we shall see, workers in each 
area have obtained results by drawing from very similar bags of mathematical 
tricks. However, there are differences between the methods used and results 

i " ! 

obtained in the two areas. In the analysis of digital filters the work has 


been characterized by the study of systems containing quite specific non- 
linearities. i In addition, much of the work has dealt with specific filter 
structure. In particulcur, second-order filters have received a great deal 
of attention [C-2,3,7,9], since more complex filters can be built out of 
seres - parallel interconnections of such sections. Also, the class of 


wave digital filters [G-5,6] have been studied in some detail. Studies in 
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these areas have yielded extremely detailed descriptions of regions of 
stadaility in parameter space and numerous upper and lower bounds on limit 
cycle magnitudes (see [C-3,4,13,24-26] ) . 

In control theory, on the other hand, the recent trend has been in the 
development of iJ^uher general theories, concepts, and techniques for sta- 
bility analysis. A nxamber of rather powerful mathematical techniques have 
been developed, but there has not been as much attention paid to obtaining 
tight bounds for specific problems. In addition, problems involving limit 
cycles have not received nearly as much attention in recent years as issues 
such as bouhded-tinput , bounded -output stability and global asymptotic sta- 

i ■ 

bility (although there clearly is a relationship between these issues and 
limit cycles) i. 

C.l The Use of Lyapunov Theory 

The technique of constructing Lyapunov functions to prove the stability 
of dynamical systems has been used by researchers in both fields (gee (C-221 
for details and further discussions) . Consider a system with state x(k) and 
with equilibrium point x = 0. A Lyapunov function V(x) for this system is 
a scalar function for which V(0) = 0 and which is nonincreasing along system 
trajectories (i.e. V(x(k)) is nonincreasing as a function of time). 

If this fTinction has some additional properties, we can prove stability 
or ihstcibility. Basically, we think of V as an "energy" function. One then 
obtains results depending upon how energy behaves along trajectories. In- 
tuitively, if V is. everywhere positive except at x = 0 and V(x(k)) decreases 
monotonically , the system dissipates energy and is stable. On the other 
hand, if V(Xq) <0 for some Xq, then the system cannot be asymptotically 
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stable, since the nbnincreasing nature of V(x(k)) guarantees that the 
system can't approach the zero energy state if started at Xq* One advan- 
tage of Lyapunov-type results is that the hypotheses for results such as 
those just mentioned can be checked without the construction of explicit 
solutions to difference or differential equations. However, the major 
problem with the theory is the difficulty in finding Lyapunov functions 
in general . 


With respect to the lindt cycle problem, v;illson [C-2,8] has utilized 


Lyapunov functions to determine conditions under which second order digital 


filters will not have overflow limit cycles and will respond to "small" 

I 

I 

inputs in a manner that is asymptotically close to the ideal response. 


Parker and Hess [C-13] and Johnson and Lack fC-25,26] have used Lyapvinov 
fxinctions to obtain bounds on the magnitude of limit cylces. In each of 
these the Lyapunov function used was a quadratic form which in fact proved 
asymptotic stability for the ideal linear system. In Willson's work [C-8] , 
he' was able. to show that his results were in some sense tight by constructing 
counterexamples when his condition was violated. In [C-13, 25, 26] the bounds 


are not 


as good as others that have been found, 'and, as Parker and Hess 


state, this .may be due to the difficulty of determining which quadratic 
Lyapunov function to use. As pointed out by Claasen, et al. , [C-3] , it 
appears to be difficult to find appropriate Lyapunov functions for the 


discontinuous nonlinearities that characterize quantization. 

There is a class of digital filters — ■ wave digital filters (WDF) 


fC-5,6] — for which Lyapunov techniques are particularly useful. Such fil- 


ters have been developed by Fettv/eis so that they possess many of the pro- 
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perties of classical analog filters. Motivated by these analogies, Fett- 
weis [C-5] defines the notion of "instantaneous pssudopower" , which is a 
particular quadratic form in the state of the ^*7DF. By defining the notion 
of "pseudopassivity" of such a filter, Fettweis introduces the notion of 
dissipativeness. With this framework, the pseudopower becomes a natural 
candidate for a Lyapunov function, and in [C-6] , Fettweis and Meerkotter 
are able to apply standard Lyapunov arguments to obtain conditions on the 
arithmetic used that guarantee the asymptotic stability of pseudopassive 
WDF's, The introduction of the concept of dissipativeness in the study of 
stability is an often-used idea (see the note of Deoser [C-141), and a 
number of important stability results have as their basis some notion of : 
passivity. We also note that the use of passivity concepts and the tools 
of Lyapunov theory appear to be of some value in the development of new 
digital filter structures that behave well in the presence of quantization 
[C-7] . 

Lyapunov concepts have found numerous applications in control theory. 

The construction of quadratic Lyapunov equations for linear systems is v#'oll 
understood and is described in detail in [C-22] . The key result in this 
area is the following. Consider the discrete-time system 

x(k+l) = Ax(k) (C.3) 

This .system is asymptotically stable if and only if for any positive definite 
matrix L, the solution Q of the (discrete) Lyapunov equation 

A'QA - Q = -L (C.4) 

is also positive definite. In this case the function x'Qx is a Lyapunov 
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function that proves the asymptotic stability of (C.3). Note that this 
result provides a variety of choices for Lyap’^iov functions (we can choose 
cuiy L > 0 in (C.4)). Parker and Hess [C-13] obtain their bounds by choosing 
L = I (here (C.3) represents the ideal linear model). Tighter bounds might 
be possible with other choices of L, but, as they mention, it is not at all 
clear how one would go about finding a "better" choice (other than by trial 
and error) . 

In addition to their direct use in specific applications, one of the 
most important uses of Lyapunov concepts is as an intermediate step in the 
development of other more explicit results. ~cr example, the stability of 
optimal linear regulators with quadratic criteria and of optimal linear 
estimators can be proven by constructing particularly natural quadratic 
Lyapunov functions [B-41 , C-32] . A further use of Lyapunov theory has been 
to provide a framework for the development of Ctany more explicit stability 
criteria. Examples of these are a inumber of the frequency domain stability 
criteria that have been developed in the last 13 to 15 years (see [C-10-12, 
15,16,20,211). These results are the subject of the next svibsectiqn. 

C.2 Frequency Domain Criteria, Passivity, and Lyapunov Functions 

We have already mentioned that the notion of passivity is of impor- 
tance in stability theory and have seen! that Fettwais and Meerkotter have 
been- able to utilize passivity notions to study certain digital filters 
via Lyapunov techniques. The relationship becwaen passivity, Lyapunov 
functions, and many of the frequency domain criteria of stability theory 
is quite deep, and in this subsection we wish to illustrate some of these 



-44- 




ideas. We refer the reader to the work of J.C. Willems [C-19,23,30] , in 
I . ■ 

particular, for a detailed development. The general development in these 

references is beyond the scope of this paper, but we will indicate some 

of the basic ideas for a discrete-time system, denoted by the symbol G, 

with input u and output y. In this case, one can define input/output (I/O) 

stability as 

. - . CO ■ ' 00 

'Z uj < “ => S y? < “ (C.5) 

i=l i=l ^ 

i.e., if the input has finite energy, so iioes the output. If we Ccm. make 


the stronger statement 




(C.6) 


we call K the I/O gain. A system is called passive if 

I . 

if there is 'an £>0 such that) 



I 

i 

I 



for all u^, M 


(C.7) 


The motivation for the definition (C.7) stems from the interpretation of 
the left-hand side of (C.7) as the total energy innut to the system. Thus 
a passive system always requiresi a positive amount of energy to be fed into 
it. 'This notion has extremely strong ties to the usual notions of passivity 
and dissipativeness for electrical networks andis, in fact, a natural 
generalization of these concepts [C-30,34]. 

Having this framework, one can derive important results on the stability 
and passivity of feedback interconnections of passive systems (see [C-301 ) , 
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much like the results of Fettweis for his pseudopassive blocks. As out- 
lined by Willems in [C-30] , there are three basic stability principles. 

The first involves the interconnection of passive systems as mentioned 
ahove, while the second is the small loop gain theorem (stability arises 

if the gain around the loop is less than unity — a result used in the 

I _ ■ i ^ 

digital filter context in [C-31] ) . The third result involves notions of 

passivity and of sector nonlinearities. A nonlinearity is inside ( strictly 

inside ) the sector [a,b] , if its graph is bounded by (strictly contained 

vlithin) the lines y -ax and y = bx. Thus, the effective gain of this non- 
i _ ^ - 

linearity is between a and b. As an example, the operation of roundoff 

is inside the sector [0,2] (see [C-3,9] for the sector characteristics of 

other quantizers). To indicate how sector nonlinearity conditions can be 

used, consider (C.2) with specified by a stable discrete time transfer 

function G(z) , and G^ a memoryless nonlinearity, f, assumed to be inside 

the sector [0,k]. In this case, the general sector I/O stability theorem 

reduces to showing that (G. + h) is a passive system, and, as developed 

• X JC 

in [C-19,30], this will be the case if and only if G(z) + ^ is positive 
real. 


Re (G Ce^“) ) + ^ > 0 


CO e to, 2tt) 


(C.8) 


which is Tsypkin's stability criterion (C-33]. 

A variant of this type of result involves the use of multipliers [A-19] 
in which one modifies the feedback system of (C.2) by inserting a second 
system in the forward path and its inverse in the feedback path. One can 
then apply the basic stability results to the modified G^ and 02* In this 
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manner one obtains Popov's stability condition [C-16] and the discrete- 
time analog due to Tsypkin [C-12, 20]: suppose f is nondecreasing and is 
strictly inside the sector [0,k]. Then the feedback system is finite gain 
I/O stable, if there exists an a ^ 0 such that. 

ReCd + a(l - e"^“)) G(e^“)] + ^ > 0 ¥ w £ [0, 2 tt] (C.9) 

Claasen, et al. [C-9] have obtained direct analogs of (C.8) and (C.9) 
for the cdasence of limit cycles of period N: 

Re(G(e^^^^'^^] + J- > 0, i = 0, 1, N-1 (C.IO) 

1C 

or the existence of a >0 such that 

P -■ 

Hettl + T o (1 - i > 0 (C.ll) 

A 



Here f is inside the sector [0,k] and is also nondecreasing in the case of 
(C.ll). The proofs given in [C"9] rely heavily on the passivity relations 
(C.IO), (C.ll) cuid cin application of Parseval’s theorem in order to contra- 
dict the existence of a limit cycle of period 'N, This last step involves 
the assumed periodicity in a crucial way, but: the application of Parseval 
aoid the use of the positive real relationship (C;. 10) is very reminiscent 
of stability arguments in feedback control theory [C-19] . In the proof 
of (C.ll) the monotohicity of f is used in conjunction with a version of 
the rearrangement inequality [C-18,191 which has also been used to study 
stability of feedback control systems. 

As mentioned at the end of the preceding subsection, many frequency 
domain results can be derived with Lyapunov-type arguments. We have also 
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seen in this subsection that many of these results can be derived via 
passivity arguments. Clearly the two are related, and the crucial result 
that leads to this relationship is the Kalman-Yacubovich-Popov lemma 
CC-27,28,30] , which relates the positive realness of certain transfer 
functions to the existence of solutions to pcurticular matrix equalities 

aind inequalities. Kalman [C-28] utilized this result to obtain a Lyapunov- 

i. . . ' ■ ; 

i 

type proof of the Popov criterion, and Szego [C-27] used a discrete-time 
version to obtain a Lyapunov-theoretic proof of TfSTOkin's criterion. We 

' i 

also note that the positive real lemma plays a crucial role in several ^ 
other problem areas including the stochastic realization and spectral 
factorization' problem [B-13] and the study of algebraic Riccati equation 
[C-29] . 

Finally, we note that many of these passividy-LyapTonov results have 
instability counterparts (e.g., see [C-1,171). Such results may be useful 
in developing sufficient conditions for the existence of limit cycles. 

In this section we have considered some cf the aspects of stability 
theory that point out the relationship among the techniques, goals, and 
results of researchers in both disciplines. As we have seen , many of the 
results in the two disciplines involve the use of very similar mathematical 
tools. On the other hand, the perspectives and goals of ^ researchers in-, 
the! two fields are somewhat different. The development of ;a mutual under- 
standing of these perspectives and goals can only benefit researchers in ; 
both fields and is in fact absolutely crucial for the successful study of 
certain problems. For example, in the implementation of digital control 
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systems one must come to grips with problems introduced by quantization. 
Digital controller limit cycles at frequencies near the resonances of the 
plant being controlled can lead to serious problems. In addition, the use 
of a digital filter in a feedback control loop creates new quantization 
analysis problems. Finite arithmetic limit cycles can occur only in recur- 
sive (infinite impulse response) filters. Hov;sver, if a nonrecursive (finite 
impulse response) filter is used in a feedback control system, quantization 
errors it produces can lead to limit cycles of the closed-loop system [C-31] . 
How can one analyze this situation, and how does one ta.ke quantization 
effects into account in digital control system design? Questions such as 
these await future investigation. 
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D. , Multiparameter Systems, Distributed Processes, and Random Fields 
A growing interest has developed over the past few years into 
problems involving signals and systems that depend on more than one 
independent variable. In this section we consider several problem 
areas involving multiparameter signals and systems in order to examine 
some of the key issues that arise. For an up-to-date view of some of 
the research in this area, we refer the reader to the recent special 



D. 1 Two Dimensional Systems and Filters 


In analogy with the 1-D case, a 2-D linear shift invariant (LSI) 

] , 

system can be described by a convolution of the input x(m,n) and the 

unit impulse response L(m,n). Alternatively, taking 2-D z-transforms , 

we obtain 


Y(z^,z^) (D.l) 

Of special interest are the rational system functions, H=A/B, which 
arise from 2-D difference equations such as 

2 b(k,2,)y {m«k,n-i) = ^ a (k,>.)x(m-k,n-^.) (D.2) 

Here I^, I^ are finite sets of pairs of integers. 
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Let us first discuss the problem of recursion. Given the equation 
(D.2), we want to use it to calculate the next output given previous outputs 
and the input. Unlike the 1-D case, in which the index n has the interpretation 
of time, in the 2-D case, in general, it is not clear what "next" or 
'• £>revious '• mean. In fact, just given (D.2) it is not clear that there 
is any definition of next or previous that will allow us to compute y(m,n) 

■ j 

recursively. Dudgeon [D-1] , pistor [D-12], and Zkstrom and Woods [D-28] 
have studied this problem in great detail . Let us consider one of the 
most important special cases of (D.2) jin which A=i and b has its 
support as indicated in Figure D. 1. We then have 


y(m,n) 


b(0,0) 


M N 

2 2 

k=o 

(k,Ji)#(,o,o) 


b(k,il)y (m-k,n-/.) + 


b(0,0) 


x(m,n) 


(D.3) 


Note that from (D.3) and the figure, it is evident that we must store 

values of y(k,J?.) for (k,Jl) to the south and west of the domain over which 

we wish to calculate y. If this domain is infinite in either direction, 

the required storage is also infinite. In fact the required storage 

grows linearly as we increase the domain in either direction (see [D-1] 
for details). Thus storage requirements in 2-D depend on far more than 
the order (M,N) of the filter. ' 

• We also find ; that the storage requirements depend on the sequencing 
of the recursion. Several directions of recursion are indicated in 
Figure D.2. Each recursion calls for its own sequence of data accessing 
and discarding. The N and E recursions appear to nave particularly 
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simple sequencing rules, but the data must be processed serially. On 
the other hand, the ME recursion has ai more complex sequencing but 
leads to the possibility of parallel computation, since,: for example, 

points 4, 5, and 6 can be calculated simultaneously. The possible directions 

! 

for recursion and potential use of parallel computation cah be determined 
with the aid of a conceptual device — the precedence relation [D-26] , 
which partially orders points with the rule (m,n) (5',k) if y(m,n) 

must be calculated before we can calculate y(’-,k). 

Let us now return to the question of recursibility. Clearly the 
picture is symmetric — i.e. , we can have NW, SE, and SW recursions, with 
b(k,S.) restricted to be a function on the corresponding quadrant. However, 
as shown by Dudgeon [D-IJ , this by no means exhausts the possibilities 
for recursion. In addition to the one quadrant functions, we can obtain 
recursive difference equations with b (k,^.) ' s that are one-sided [D-i] . : 

In this case the support of b is as in Figure D.3, and we can calculate 

i ' ■ ! , ' 

y(m,n) column by column, recursing to the north and then shifting to the 
next column to the east. 

Let us make einoth^r connection with 1-D processing. Suppose that 
one of the two indices, say m, has the interpretation as time. Then 
one might think of y(m,n) and x(m,n) as (1-D) spatially distributed 
processes that evolve in time. Temporal causality might then correspond 
to .the support of b in Figure D.3 being modified by deleting the points 
on the. positive n axis, yielding a "strictly" one-sided function. In 
this case, one could define the "state" of the system, and it is clear 
that this "state" will be finite dimensional only if the range of n is 
bounded,' which is precisely when the required storage for the 2-D. 
recursion is finite. 




I 

Figxjre D.3 Support of a One-Sidad Function 
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As mentxoned earlier, the ability to solve a 2-D difference equation 
recursively leads directly to the definition of a partial order on the 
part of the 2-D grid over which we wish to solve the equation. Given 
this precedence relation , one then has some freedom in I deciding how to 
sequence the calculations. Specifically, if we think of a sequence of 
calculations as determining a total order on the part of the 2-D grid 
of interest, all we require is that this total order be compatible with 
the precedence relation i Once we have such a total order, we can either 
view this as transforming 1-D filters into 2-D filters or vice versa 
[D-15] . One widely used order is the line-scan (D-1,5,151 ; 

(i/j) < (Jlfk) if i<2, or ±=i and j<k 

AssxOTing we are interested only in lines of finite extent, we can 
readily see one of: tiie problems with this order and with orders in • 
general. If we attempt to process the order input data with a 1-D 
LSI. system, the resulting 2-D processing is not SI, essentially because 
our ordering has placed the last point on one line "next" to the first 
point on the next. 

We close our discussion of 2-D orders and precedence relations 
by noting that these very same issues arise naturally in certain feedback 
control problems. Ho and Ghu [D-21] consider optimal control problems 
in which one has a set of decision makers who base their decisions 
on certain observed data. Ho and Chu define a precedence relation 
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among decisions: j-<L if the decision of j affects the observation of 

i. They assume that this is a partial order — i.e. that if j-<i, we 
cannot have i-^j (this is precisely the condition needed for recursibility 

: I 

of 2D filters). Then, under a "partially nested information condition" — 
if j^i, theii i's observation includes knowledge of j's observation— they 
solve an optimal control problem. Witsenhausen [D-22] has also, studied 

: I ! .. . . : 

this partial order and has poihted out that if one totally; orders the 
set of decision makers in a way compatible with the precedence relation, 
one can then define the state evolution of the system. Hence we see 
that there may be I many possible sets of states corresponding to different 
compatible total orders (just as storage requirements vary with the choice 
of recursion). ; 

In the preceding discussion we have seen that the presence of a 
partial order as opposed to the usual 1-D total order leads to some 
complications. New difficulties are also encountered in the study of 
the stcibility of recursive 2-D filters [D-1,13]. As in the 1-D case, 
the stability of the filter depends on the direction of recursi<i>)?i, 

^ i 

and there are many more possibilities in 2-D. In addition, although 
there are analogs of results such as those in 1-D that test to see 
if all poles are inside the unit circle [D-1,9] , the required calculations 
are far more complex. This increase in complexity also arises in the 
related : problem of the stabilization of a given 2-D system function 
while keeping the magnitude of the frequency response ’unchanged. In 
1-D this is done easily by shifting those poles that lie outside the 
unit circle, but this cannot be done that easily in 2-D, since we cannot 
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factor 2--D polynomials.* 

Another stabilization approach in 1-D is spectral factorization — 
i.e. we write a given rational H(z) as the product of two pieces, 

H (z) and H (z) , where H has all its poles inside the unit circle (and 

£ W E 

hence is stable if used to process inputs in the eastern direction) and 
has all its poles outside the unit circle (stable to the west) . 

Thus, in 2-D, one is tempted to seek factorizations into four stable 
quadrant filters [D-12] or into two stable half-plane filters [D-1,16,28] 
much like the 1-D case. Such techniques have been developed using 
2-D cepstral analysis, and we refer the reader to the references. We' 
do note that the lack of a fundamental theorem of algebra does mean 
that the factors in these factorizations will not in general have 
finite order denominators. 

A final stabilization procedure is based on the guaranteed stabi- 
lity in 1-D of least squares inverses. The least squares inverse (LSI) 
is obtained using exactly the methodology one brings into play in 
performing linear prediction of speech , Given the denominator B and 
its inverse transform b, one seeks a finite extant impulse response p 
that approximates the convolutional inverse of b by choosing the 
coefficients in p to minimize the sum of the squares of the difference 
between b*p and the ion it impulse. In 1-D, one has the guarantee that 
p is minimum phase (i.e. that the all pole modal 1/p is stable). In 
[D-13] Shanks, et al. , conjectured that this minimum phase property holds 
in 2-D. Under this assumption, they proposed the use of a double least 
squares inverse to stabilize and unstable denominator of a NE filter. 

*This is often referred to as the "absence of the fundamental theorem 
of algebra" for multivariable polynomials (see, for example, [D-1] ) , 
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Using this design procedure, numerous 2-D filters have been designed. 
Unfortunately, Genin and Kamp [D-53] have recently shown that, this 
conjecture is false in 2-D if one constrains oneself to quarter-plane 
filters (although it is true in certain restricted cases [D-50] ) . On 
the other hand, Marzetta [D-43] has shown that the desired minimum 
phase property does hold if the least squares inverse problem is posed 
in terms of half-plane filters. We will return to this point again 
later. 


As in the 1-D case, a critical question in the design of 2-D HR 
filters is the existence of limit cycles a.nd the effect or roundoff 
noise on the filter output. The results in lD- 10] on the existence of 
horizontal, vertical, ,ind noninteracting diagonal limit cycles parallel 
1-D results. Open questions involve the intrigiaing question of whether 
one can extend any of the other techniques discussed in Section C. 

Do the passivity-Tsypskin-positive real-frequency domain results of 
Claasen, et al. , [C-9] and others extend to the 2-D case? t^hat about 
the Lyapunov techniques of Willson [C-2]? Of course in this case one 
would need 2-D state space models and a 2-D Lyapunov theory, which in 
itself might be of interest in providing a method to test for stability 
of 2-D LSI systems even with perfect arithmetic. 

The analysis of roundoff noise in 2-D filters can be carried out 
much as for 1-D filters, but another open question concerns the exten- 
sion of the covariance noise analysis method described in Section A 
for 1-D roundoff analysis. Again one would need a state space model 
in order to consider this question. 



D.2 Two-Dimensional State Spae^ Mo d els 

I • ^ i ' ' 

In addition to 1-D state space cescriptions for r|ecursively 
ordered 2-D systems [D-26 ] , some work has been done on the analysis 
of 2-D state space models. Roesser [D-27] considers the NE model 

v(i+l,j) = A^v(i,j) + A 2 h(i,j) +B^x(i,j) 

I' 

h(i,j+l) = A^v(i,j) + A^h(i, j) + B 2 x(i, j) (D.4) 

y(i,j) = C^v(i,j) + C 2 h(i,j) + Dx(i,j) 

here x is the inpub, y is the output, and v and h together play the 
role of a "state" variable, "camping" vertical and horizontal information, 
respectively. Given this model, Roesser considers several issues, 
including a variation of constants formula to solve (D.4), a 2-D 
version of the Cayley-Hamilton Theorem, which in turn is used to obtain 
an efficient method for computing the transition matrix, and the notions 
of controllability and observability. In obtaining his algorithm for 
recursively computing the transition matrix via the Cayley-Hamilton 
the'orem, Roesser used the notion of 2-D aigenvaluas in a crucial manner, 
amd in the usual nonfactorizable case the calculation of zeroes of a 
characteristic polynomial is extremely difficult. This not only 
complicates his transition matrix algorithm, but it makes stability tests 
more difficult, as we have already mentioned. Furthermore, the model 
(D.4) is limited to quadrant-causal systems. This is perfectly reasonable 
for the study of quadrant-recursive filters, but its value for the 
analysis of other 2-D signals is unclear. For example, Roesser mentions 
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the possibility of a 2-D filtering theory, whs 
of a "spatial shaping filter." As Ekstron and 
one cannot obtain arbitrary spectra from a KE 
one may need two such filters, as well as a me 


spectra of the signal field. Finally, we note 
(v(i, j) ,h(i, j) ) might better be termed a "loca 
saw earlier, in recursively solving 2-D scuati 
of storage in general depends on the size of m 
while dimensions of v and h correspond to the 


re (D.j4) plays the role 
Woods [D-28] point out, 
shaping filter. Hence, 
thod for modelling the 
that Roesser’s "state" 

1 state" [D-24] . As we 
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Issues of this type have also been considered by Fornasini and 
Marchesini [D-24] . They consider local WE stare space descriptions of 
the form 


x(m+l,n+l) = AQx(m,n) + A^x(m+l,n) + A^x-m^n+l) + Bu(m,n) 

y(m,n) = Cx(m,n) (D.5) 


Theyj show, that a NE HR filter can be realized as in (D.5) if and 
only if the transform of the impulse response is rational. The "if" 
part of this result involves the construction of a realization ' that 
is a generalization of the 1-D "standard controllable form." Having 
such realizations, attention naturally focusses on minimality. This 
leads directly to the notions of (local) conorollability and observability, 
with finite rank conditions for these prcperoiss being developed in a 
manner analogous to that of Roesser. The main minimality result of 
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ii 

March^isini and Fornasini is that raininality inplies local controllability 
and observability but that local controllability and observability ^ not 
imply minimality. 

Attasi [D-23] has studied a special case of (D.5), in which 





-AjRl 


( 0 . 6 ) 


In this case, the system transfer function is separable (z^) ) » 

and, as shown in [D-25] , this is the only case in which the global 

state is finite dimensional. As any FIR filter can be realized by 

(D.5), (D. 6) ,: any stable impulse response can be approximated i arbitrarily 

closely by a System of this form. This, of course, is neither startling 

norj necessarily very useful, since the dimension of the resulting state- 

space system may be extremely large. Having chis framework, Attasi defines 

dual notions of local controllability and observability and derives 

conditions somewhat simpler than in [D-24,27] because of the assumed 

separability. Attasi also considers minimal realizations, obtains a 

state space decomposition result, and minimal realization algorithm much 

like those in 1-D, and shows that minimality implies controllability and 

observability. He also proves the converse of this last result, but 

this is only true if one looks for the minimal realization in the 

class of models satisfying (D.6). We refer chs reader to [D-24] for an - 

example illustrating these points. 

Undoubtedly the major contribution of Attasi 's work is that he 
did something with his models, and we will discuss his filtering 
results in the next subsection. He also developed a 2-D Lyapunov 
equation, which he used in a generalization of the "invariance principle," 
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. The exact implication of this resulc for 2-D stability theory 
and its potential utility in such areas as liziit cycle analysis remain 
as questions for further work. Attasi also considers systems as in 
(D.5), (D.6) which are driven by 'White noise, .^gain he obtains a 2-D 
Lyapunov equation for the state covariance, and this result may be of 
some value in performing roundoff noise analysis for 2-D filters. In 
addition, he develops a stochastic realization oheory that exactly 
parallels the l-D" case -with .one rather surprising exception, in that, 
unli.ke the 1-D case, in the 2-p case the stcohastic realization is 
essentially unique. This is due primarily oo one additional constraints 
imposed by the-fact that we use a single quadrant shaping filter. 

Another novel feature of Attasi's developmeno is the necessity for 
using non-square factors — i.e. to perform the required factorization 

S(Zi,Z2) = H(Si,Z 2)H' (Zi ^,Z2 /■) (D.7) 

where H is NE causal arid of the form (D.5), one must consider 

rectangular factors. For example, if y is a scalar process, then 
H in general must be Ixm, and, in fact, the aforementioned uniqueness 
result fixes the value of :m. 

Recently, Morf, et al. ,i [D-31] have made several noteworthy 
contributions to 2-D stats space theory . They consider the properties 
of polynomial and rational matrices in two variables in order to 
generalize the scalar 2-D polynomial results of 5ose [D-30] and the 
matrix 1-D polynomial results of Rose.nbrock i3-32] and Wolovich [D-33] . 
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The concepts of local controllability and observability for the Roesser 
model are explored in [D-31] , and the authors point out that these 
conditions neither imply nor are implied by the minimality of the 
realization (this is done with several instructive examples). To obtain 
notions of controllability and observability that are equivalent to 
to minimality, Morf, et al., generalize the approach of Rosenbrock, | 
and this leads to the notions of modal controllability and observability 
and a related concept of minimality. In this setting the existence of 
minimal realizations becomes a difficult problem, and one may not even 
exist if we restrict ourselves to systems with real parameters. In 
related work, Sontag [D-29] has also found realizations of lower 
dimension than those proposed by Fornasini and Marchesini, and he has 

i : : . 

shown that minimal realizations need not be unique up to a change of 
basis. All of these facts indicate that the 2-D state space model is 
an extremely complex one and offers some extremely difficult mathematical 
and conceptual problems. It remains to be seen whether any of these state 
models and realization theories can provide a useful framework for 
solving 2-D analysis and synthesis problems. 

D.3 Image Processing, Random Fields, and Space-Time Processes 

Digital processing of images for data compression, noise removal, 
or 'enhancement is one of the major areas of applications of 2-D digital 
signal processing techniques. In addition, Lmage processing has spurred 
a great deal of work in the analysis of spatially-distributed stochastic 
variables— random fields. Let g(i,j) denote the image radiant energy 
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as a function of two discrete spatial variables, where, for the time 
being, we will assume that the system is free of noise. The image 
results from an image fozmiation process that transforms the original 
radiant energy f(i,j) into the observed image. A general model that 
is often used for the image formation process is 

N 

g(i,j) = S h(i, j,k,^)f (k,^) i,j=l,...,N (D.8) 

where h is the point-spread function (PSF) , which models the smoothing 
and blur that take place in the image formaticn process [D-4, 19,46). 
Note that one important case of (D.8) is the shift-invariant case, in 
which h depends only on i-k and j-2-. in this case (D.3) is a 2-D 
convolution. 

In addition to the image formation process , one must take into 
account the process of image s'e^ording and storing. Several noise- 
corrupted nonlinear image models have been developed [D-19,46] for 
this; however, as discussed in [D-46] , often one may be able to justify 
the use of an additive noise model 

q(i/j) = g(i,j) + v(i,j) (D.9) 

where v is an additive noise process. We now curn our attention to 

i I i 

the analysis of this model. 

At various points in this development, it will be more convenient 
to view f,g,q, and v as vectors by performing a scan (lexicographic) 
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. 2 2 
ordering, in which case we write q=Hf + v, where H is an N xN matrix 

formed from the PSF. Examination of (D.8) yields that H is an NxN 

matrix of NxN blocks {h. where the (m,n) element of H. . is 

h(i,m,j,n). If the imaging system is shift-inj/ariant, it is readily j 

seen that H is block Toeplitz, and, in fact, each of the blocks is 

itself a Toeplitz matrix. Note also that if h is separable, then 

h(i,j,m,n) - h^ (i ,m) h^ ( j /H) and H = (g) 

where ® denotes the teneor or Kronecker product, and is an 

NxN matrix obtainable from h! . 

'1 

It is evident from' the preceding development that probabilistic 
andj statistical methods must play some role in image processing .i In 
this context, f,g,v, and perhaps h are random, fields . For now we 
consider such a random field s(i,j) to be characterized by its mean 
s(i,j) and its covariance 

r(i,j,m,n) = E{[s{i,j)-s(i,j)] ts(m,n)-s(m,n)] } (D.IO) 

The field -will be called (wide-sense) stationary if 
i r (i, j ,m,n) = r(i-m,j-n) (D.ll) 

I ■:- . : ’ ■ '■// ' . i 

Note that if s is ordered lexicographically, then its covariance R 

I " V 
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2 2 

is the N X N matrix obtained from r in the sar.e manner that H is 
obtained from the PSF h. We also observe that R is block Toeplitz 
with Toeplitz blocks if s is stationary. 

One important problem in image processing is the efficient repre- 
sentation of images for storage or transmission [D-46,47]. One well- 
known method for obtaining a less redundant representation of an image 
is the Karhunen-Loeve transform (D-46] , which involves the diagonalization 
of R. However, in general, this tramsform involves exorbitant amounts 
of computation. There are, however, several special cases in which this 
transform can be calculated efficiently. Cne of these, motivated by 
similair analysis performed by Hunt I D-46] and Andrews and Hunt (D-19] , 

is quite instructive. Suppose that s is stationary, and that any 

* 

particular pixel is uncorrelated with ones some distance d away. Then 
the block Toeplitz covariance matrix is nonzero only near the 
main diagonal (and the seune can be said for each of the blocks) . 

We now modify R amd its blocks, to make R block circulant with circulamt 
blocks. A block circulant matrix is block Toeplitz with each row a 
cyclic shift to the right of the preceding one, where the last block 
on the right of one row becomes the first block on the left in the 
next row. This merely means replacing some of the zeroes in R with 
nonzero entries. Intuitively, imagining the image as a flat array, we 
have connected opposite edges, first to creame a cylinder, and then a 
torus. The reason for making this approximation is that the matrix 
of eigenvectors of R^, the circulant modification of R, can be computed 
efficiently using the fast Fourier transform (see [D-19, Z-1] ) , and 
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thus the Karhunen-Loeve expansion can be performed quickly,. 

As discussed in Section B, one of the r.ost widely used coding or 
compression schemes for 1-D tine series, such as speech, is linear 
prediction, in which we design a one-step predictor or inverse whitening 
filter for the time series. This method has several appealing features 
in 1-D — it is efficient (if one uses the Levinson algorithm) , it leads 
to recursive coding and decoding algorithms, and it yields excellent 
performance. In 2-D the situation is not as clear. What direction do 
we predict in and what old data do we use tc do the prections? At this 
time, answers to thesa questions are beginning to be formed. Genin 
£Uid Kamp [D-53] have shown that NE predictors need not be minimum 
phase, and Marzetta [D-43] has provided an arg-ment for why this is 
in fact the case. Specifically, in 1-D, we are guaranteed that the 
optimal predictor for y(n) based on y(n-l), y (n-2) , . . . ,y(n-r) is 
necessarily minimum phase; however, if we skip some points in the 
past — e.g. , if we predict y(n) based on y (n-1) ,y (n-2) , and y(n-4) — 
the optimal predictor may not be minimum phase. Marzetta points out 
that NE predictors ^ skip points For exam.ple, consider the predictor 
of y(m,n) based on y(m-l,n), y(m,n-l), and y(r.-l,n-l). If we totally 
order the points in the plane in a fashion compatible with the partial 
order for calculating points recursively to the ITE, then (m-l,n) , 

(m,n-l) , and (m-l,n-l) will never be the three immediate predecessors 
of (m,n) . Thus, just as in 1-D, there is reason to expect the 
optimal predictor to be minimum phase. Marzetta then points out that 
if we don’t skip points — i.e. if we use a full half-plane predictor — 
we do get the minimum phase properties, Levir.son-type algorithms 
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involving reflection coefficients, etc. Note that this predictor is 
primarily of conceptual interest, since the predictor involves the 
incorporation of an entire, infinite extent col-umn before any points in 
the preceding column may be included. We refer the reader to [D-43] 
for details euid for practical, suboptimal methods which also have the 
minimum phase property. We also refer the reader to [D-511 for another 
generalization of the Levinson algorithm to 2-D. 

We now turn our attention to the problem of restoring blurred and 
noise-corrupted images [D-4,19,46J. A numdjsr of nonrecursive methods 
have been developed for the removal of blur and for noise rejection — 
inverse filtering, 2-D minimum mean-square error (MMSE) Wiener filtering, 
etc., and we refer the reader to the survey [3-46] for more on these 
methods and for further references. We merely point out here that 
techniques such as the Wiener filter have some difficulties and limitations 
as image processing systems. To a great extent this is due to the 
fact that the MMSE criterion is not particularly well-suited to the 
way in which the human visual system works [3-20]. In paurticular, the 
Wiener filter is overly concerned with noise suppression. In addition, 
in order to make the filter computationally feasible, one often assumes 
stationarity. This in turn leads to a filter that is insensitive to 
abrupt changes — i.e. it tends to smooth edges and reduce contrast. On 
the other hand, in high contrast regions, the human visual system 
will readily accept more noise in order to obtain greater resolution. 
Several schemes have been proposed that are aimed at trading-off between 
the potentially high-resolution, poor noise performamce of the inverse 
filter and the lower-resolution, good noise performance of the Wiener 
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filter. One of these is the constrained least squares filter , [D-4,19]. 

Several other observations can be nade concerning the processing 
systems mentioned so feu:. As mentioned earlier they are nonrecursive and in 
principle require the block processing of the entire image or subs- 
tantial sections of the image. Hence the ccncutational burden of 
these schemes can be quite high. In 1-D, one find that recursive 
methods are often preferable to nonrecursive ones because of their 
computational advantages. As discussed in iI-19] the 1-D Kalman filter 
offers great computational savings over nonrecursive methods, and an 
appealing question is the extension of such filters to 2-D. Anyone 
familiar with 1-D Kalman filtering theory realizes that the design of 
the filter relies heavily on a dynamic representation of the received 
signal. Hence, to develop such techniques in 2-D, we need a more 
complex model of an image than that provided by the mean and covariance. 

The need for the use of such models is an obvious drawback to this 
approach, but the potential gains in computational efficiency represent 
a distinct advemtage. 

One approach to recursive processing of images involves the 1-D 
processing of the scam-ordered image (see subsection D.l). This work 
has been developed by Nahi, Silvemam, and their colleagues [D-2,5,7,34] . 
Suppose we have an image f(m,n) (assumed to be zero mean for convenience) 
with stationaury covariance r(k,i), and we obser’/e q*f+v where the 
additive noise v is, for simplicity, assumed co be zero mean and white, 
with vaxiamce R. We now take the scan ordering of the NxN grid on 
which q, f, and v are defined. Let us use the same symbols to denote 
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the resulting 1-D processes. We then have 

E[f(k)f(i)] » S(k,Z) k,Z-l, (D.12) 

where S(k,2,) can be calculated fron knowledgs cf r(ra,n). Note that 
the scanned image f (k) is not stationary due ~.o the abrupt chemge that 
oticurs when the scanner reaches the end of one line and Logins the 
next. We wish to use Kalman filtering tazhniruss in order to suppress 
the noise. In order to do this, we need a snate space model for f. 
Unfortunately, as pointed out in [D-5] , doss not have the 

required separability that is needed in order for such a realization 
to exist. Hence, some sort of approximation is needed, amd several 
have been developed. The simplest of these involves finding a 
stationary approximation R(k) to (D.12), moh as Manry and Aggarwal 
found shift-invariant approximations to the shift- varying scanning 
filters they studied in [D-15] . Having R(k) , one can then use some 
realization procedure to find a Markov model thao realizes or approximates 
the given correlation function. 

We can now obtain an image restoration szheme by direct appli- 
cation of Kalman filtering. Several commenos are in order. We first 
note that the filter has an artificial causalioy — only the points below 
and to the left on the same li.ne affect the esoimate of a given pixel. 

This can be partially removed by the use of snoothing. With the model 
we have developed, this can be done efficienoly with two Kalman filters, 
scamning in opposite directions and starting ao opposite ends of the 
image. The resulting estimate still has diffirolties because of the 
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effects at the ends of lines. In this case, one can -emove some of 
these difficulties by transposing the image and performing the same 
type of processing again — we then have NE, SE, and SW Kalman 
filters. 

The recursive methods discussed so far have assumed that there 
is no blurring due to a nontrivial P3F. If there is such blurring, 
essentially we must develop * scan-ordered, 1-0 dynamiical model for 
the effect of the blur and then incorporate this model into our 
Kalman filter. The simplest example of this — motion blur along the 
direction of the scan — was considered by Abcutalib and Silverman [D-7] 
(see (D-54] for consideration of more general blurs) . Again this 
system offers computational advantages over nonrecursive schemes, but 
the restoration system may be very sensitive to errors in the knowledge 
of the PSF. 

The previous technique did not di.rectly use a 2-D recursive model 
for the image. The first work along this line was that of Hedaibi [D-6] 
who considered a 2-D, recursive, auto-regressive shaping filter 

x(k+l,Ui) = p 2 x(k+l,^) + D^x(k,Ui) - o^:^x(k,70 + (l-Pj)^^ , ^) 

(D.13) 

where w(k,i) is a white, zero mean, unit varience process. Assuming 
measurements of the form \'»x+v, Habibi then developed a suboptimal 
estimator to estimate x(k+l,?^l) based on {y(m,n) !m^<, n^^}. The 
suboptimality of HadDibi ' s estimator arises essentially because x is 
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only the local state, and one needs to estimate the global State for 
optimal filtering. The most complete study of optimal 2-D Kalman 
filtering has been performed by Woods and ?adevan (D-41J . We assume 
that we have a one-sided causal dynactic model for the ramdom field 


M +M 

x(m,n) ■ y y b(k,£)x(n-k,r.-^l) 

)c»l il— M 
M 

+ y b(0,Z)x(m,n-I) - w(r.,r.; 
A-1 


(D.14) 


Suppose we want to estimate x(n,n) given all values of q»x+v in the 
past, where past is defined relative to the direction of recursion 
in (D.14). Woods amd Radewam point out that this cam be done op- 
timally with an extremely high dimensional :<alr.an filter to estimate 
the global state of the system, which in this case has dimension on 
the order of MN (M»order of the filter, ^;»•«•idth of the image) . 

Optimal line-by-line Xalmam filtering for images has also been 
considered by Attasi [D-231 using a stochastic version of the model 
discussed in Subsection D.2. Specifically the image is assumed to be 
generated by a separable vector analog of the model used by Habibi [D-6] 


x(i,j) * F x(i-l,j) + F x(i,j-l) - r,“ x(i-l,j-l) + w(i-l,j-l) 

X ^ ^ £ 

(D.15) 

q(i,j) » f(i,j) + v(i,j) * Hx(i,j) - v'i,:' 


We wish to obtain the optimal estimate x(m,n' cf x(m,n) given q(i,j) 

for i<m and all j. The optimal estimate in this case consists essentially 
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of two 1-D operations. Suppose we have x(n-l,n) for all n. We first 
predict ahead one line to obtain 

x(m,n) « Fj^x(m-l,n), for all n (D.16) 

Note that each of these estimates is calculated independently. We now 
observe the new line of measurements q(m,r.) for all n, and we create 
the error process and the error measurement 

e(m,n) =» x(m,n) - x(m,n) (D.17) 

y(m,n) = q{m,n) - Hx(m,n) = He(m,n) + v(m,n) (D.18) 

Thus we have a 1-D estimation problem — estimate e(m,n) for all n, 
given y(m,n) for all n. Attasi shows that one c«ui obtain a finite 
dimensional 1-D realization for e(m,n) as a function of n. Hence, 
this estimation problem reduces to the usual 1-D smoothing problem. 

The solution consists of two 1-D Kalman filters starting at opposite 
ends of the line. Furthermore, the optimal smoother can again be 
implemented with two filters of the type devised by Attasi — one 
sweeping the columns in order of increasing m, and the other in order 
of decreasing m. This is reminiscent of the decomposition of 
zero phase filters into two half-plame filters [D-12,28]. 

The method of proof used by Attasi i .’olves the talking of z-trans- 
forms along the n direction and the treatment of m as a time variable. 





Essentially we regard the 2-D system as a high-dimensional (infinite 

if the domain of n is unbounded) 1-D system, where we use a spatial 

tramsform "along" the 1-D state vector in order to simplify the 

calculations. The key step in Attasi's develcpment is a derivation 

of a set of Riccati equations, parametrized by the transform variadsle 

z, for the power spectral density S (z) of e(m,n) considered as a 

m 

function of n. One can then factor these spectra to obtain the 1-D 
realizations of the e's. 

Methods which transform the data in one direction in order to 
simplify or to study recursion in the other have also been used 
in several other image processing schemes. Fcr example, a method 
very similar to Attasi's was used in [D-52] . In addition, Jain and 
Angel [D-11] have considered fields descriced by a nearest neighbor, 
interpolative equation [D-3] 

x(m,n) = otj, [x(m,n+l) + x(m,n-l)] + x(m-l,n)] + w(m,n) 

(D.19) 

Following iD-11) , let us consider the vector scaji process — i.e. we 

process am entire line of observed data, i~x-v, at a time. Defining 

the resulting 1-D vector processes x , y , w , and v , we cam write 

m m r. m 

(D.191 as 


^m+1 


" 

m 


X , + w 

m-1 m 


(D,20) 
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where Q is a symmetric, tridiagonal, Toeplitz matrix. Jain and Angel 

point out that the diagonalization of Q, can be performed 

with the aid of the FFT without emy approxir.ation. Thus, if we 

define the transformed quemtities x , v , etc., (x >» M'x ) we obtain 

mm mm 

a set of N decoupled estimation problems, indexed by j (which indexes 
the components of the transformed vectors) : 


Vl,j 


X.x . - X , . + w 

3 m ,3 m-l,j ra ,3 


(D.21) 


y . “ X . + V 

m,3 m,3 m,3 


(D.22) 


Each of these problems can be solved using a Kalmain filter, and we obtain 
an extremely efficient implementation — transform the observations, 
solve the low-dimensional decoupled estinatior. problems (perhaps in 
parallel) , and tramsform bac)c. 

As we have seen, optimal 2-D Kalmeui filraring algorithms require 
large amounts of storage and computation. Thus, -he study of suboptimal 
estimators that require less computation is cf importance. One suboptimal 
filter developed in [D-41] is the reduced updare Kalman filter. Examining 
the optimal filter of Woods amd Radewan, we see that the predict cycle 
is computationally straightforward — one simply uses the recursion (D.14) 
assuming no noise and using preceding estim.ames. The measurement update 
part of the optimal filter, on the other hand, involves updating the 
estimates of all of the components of the stare. Assuming N»M, we 
expect that a given pixel is most correlated only with a small percentage 
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of the elements of the state vector. Therefore, it seems reasonable only 
to update the estimates of those components of the state that are within 
a certain distance of the point being processed — i.e., we constrain meuiy 
of the gain elements to be zero and essentially allow only "near neighbor 
updates. " 

We have now surveyed a number of nonre cursive and recursive esti- 
mation methods. The recursive techniques ccr.e with many of the same 
criticisms that were made concerning nonrecursive filters. They require 
detailed models of the image statistics and trace formation process, and 
they are essentially based on the m:- 1SE critericn. Hence, they in general 
will sacrific resolution in favor of noise suppression. In addition, 
these recursive techniques necessarily affect the image because of the 
assumed model structure. Some of the recursive techniques allow the 
inclusion of image blur, while in other cases the extensions to include 
blur have yet to be developed. Also, we have seen that in some cases 
optimal Kalm£m filtering is extremely complex, and suboptimal, but 
intuitively appealing, recursive filter structures must be used. In 
other cases we have observed that the use cf the structure of the 
assvimed model can lead to extremely efficient cpcimal estimation algo- 
rithms (with the aid of transform technique s^ . In addition, although 
work in this area has been limited in extent 11-7,34], the recursive 
techniques are directly cunenable to the analysis of space-varying and 
nonstationary models. Thus, in spite of the ra.ny qualifications, we find 
enough positive attributes to warrant continued study of recursive 
techniques for image restoration. 
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One import 2 mt area for future work involves ohe reliamce on a priori 
information. As mentioned earlier, one often tan ass'jme knov/ledge of 
the PSF or can determine it by observing kno>T. test scenes through the 
imaging system. In other cases, we may not have such information amd 
must estimate the PSF as well as the image. Thus one important question 
concerns the robustness of these techniques in the face of modelling 
errors. As mentioned in Section A, techniques ft exist for the 
sensitivity analysis of 1-D state-space models a.-.i 1-D Kalman filters. 

Can we extend these methods to the 2-D case, and hew well do the 2-D 
algorithms perform? In addition, methods ahfund in 1-D for on-line 
parameter identification and adaptive estim.atien in the presence of 
un)cnown parauneters. Cam we apply these methods with any success to 
the 2-D problem? 

A second area of concern is the resoluticn-noise suppression 
tradeoff. As mentioned earlier, the humam visual system is willing 
to accept more noise in certain regions, such as edges, in order to 
improve resolution. Thus, in relatively slowly varying regions of 
the image, we would like to remove noise, whale where there are abrupt 
scene changes or other high frequency fluctuations of interest, we 
would prefer to forego noise suppression in favor of resolution [D-4] . 

In this context am important problem is the detection of edges or 
bovmdaries between different regions in an image, /re also note that 
in many applications the determination of the hcur.daries themselves may 
be the key issue [D-35). In recent years a variety cf techniques have 
been developed for detecting and recognizing various types of boundaries 
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in 2-D data (as an example, see (D-48]). In 1-D, a variety of recursive 
techniques have been developed for the estir^tion eu\d detection of abrupt 
changes in signals [D-44] . These techniques have been successfully applied 
in a wide variety of applications, and important question then is 
the extension of methods such as these to the detection of boundaries 
in images (see [D-8,35] for some work along these lines). 

Throughout this subsection we have seen several examples of 2-D 
signal processing problems in which good use is made of the transformation 
of the signals obtained by considering them zc he 1-D vector time signals, 
in which the other independent spatial variable is used to index components 
of the vectors. There are, of course, many problems in which the 
processes to be studied truly are space-tim.s processes (Z-11 , and in 
mamy of these the use of 2-D concepts can of men be of great value. One 
of the best exaunples of this type arises in problems of seismic signal 
processing [D-17,18,42] in which we observe the time response of the 
earth using a spatial array of sensors. Other applications plus several 
specific problem formulations are discussed in [2-1] . In addition, [Z-1] 
contains a brief discussion of results and forr.ulations that utilize both 
1-D and 2-D tools of stochastic calculus and .martingale theory (D-36,37]. 
Such techniques are in their infancy and work continues to determine their 
utility and limitations. We note only that the problem of the lack of 
a natural total order in 2-D causes difficulties in extending 1-D 
stochastic calculus concepts to 2-D. This is not surprising, given the 
several complications that we have already discussed. 



Given the several examples described ear?.ier in this section, it 


is our contention that there is potentially much to be gained by uti- 
lizing both the perspective of 2-D signals and systems as well as that 
of 1-D space-time systems in studying problems of either type. As a 
final example of how this concept might be used, consider the study 
of large interconnected systems. In this case ve let the spatial 
variable index subsystem variables which may be vector quantities 
themselves. A general linear model then is cha recursive 2-D model 

x(k+l,i) =lA^ x(k,j) + 1 3^ u(k,j) (D.23) 


y(k,i) 


I 

j 


C^jX(k, j) 


+ v(k,1) 


(D.24) 


Much as in the analysis of 2-D Kalman filters iD-41] , the off-line 
analysis of such systems (solution of Lyapunov or Riccati equations, 
for example) , as well as the on-line implementation of centralized 
optimal controllers or estimators, may become prohibitively complex. 

Indeed the analogy extends farther, as the "nearest neighbor" 
constrained filter of VJoods-Radewan involves precisely the same philosophy 
as is used in many decentralized control and estimation problems (A-26, 
D-38] . 

. Let us note that there may be many other useful insights to be 
drawn from this type of analogy. For example, if the model (D.23), 

(D.24) falls into the class considered by Attasi, then the optimal 
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centralized Kalman filter can be efficiently implemented using Attasi's 
line-by-line optimal filter, which involves dynauaics in the transmission 
of information aunong subsystems (Attasi's 2 filters "along” each "line"). 
As a second example, in the case in which (D.23), (D.24) are spatially 
invariant, Melzer and Kuo [D-39] and Chu [D-40] made good use of the 
struc!:ure by taking spatial transforms in studying centralized and 
decentralized optimal controllers. Similar analysis is contained in [Z-1] 
for the case of a finite string with circulair symmetry. Much as' in 
the case of block circulant approximations [D-19] , this allows us to 
use FFT techniques to reduce the complexity of the on and off-line 
calculations for centralized controllers in a manner very similar to 
that of Jail) and Angel [D-11] . In addition, the use of spatial windowing 
techniques [A-1] to obtain nearest neighbor decentralized control 
algorithms may allow us to develop useful designs for such circularly 
symmetric systems. 


•% 
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Concluding Remarks 

In this paper we have examined a number zf 'z 
that have attracted workers in two disciplines— d 
and control amd estimation theory. The goal zf t 
been the acquisition of some perspective on rslat 
questions asked, methods used, amd general philcs 
researchers in these disciplines. Upon undertaki 
my feeling that such perspective would be exorere 
moting collaUaoration and interaction among rssear 
fields. Upon concluding this study, I think that 
has been thoroughly substantiated. !;ot only are 
examples of questions in one discipline that tan 
of view of the other, but also we have found a nu 
that naturally arose from combining the two point 

Each of the disciplines has its own distinct 
clearly these will and should be maintained. Zt. 
discipline cam gain from understamding the other, 
have their limitations, such as in specifying use 
and structures. On the other hand, state space r. 
tremely powerful computer-aided algorithms for nc 
design specification, etc. State space ideas als 
sider multivariable and time-varying systems. 
state space theory may prove of value to people i 
signal processing. On the other side, researcher 
have answered many crucial questions related to t 
cations into implementable designs. The deep tnc 


road research areas 
igital signal processing 
his exaunination has 
ionships eunong the 
ophies adopted by 
ng this study it was 
ly valuable in pro- 
chars in tha two 
ny initial feeling 
there numerous 
benefit from the point 
nber of new issues 
s of view. 

character , and 
the other h^md, each 
State space methods 
fui digital algorithms 
ethods provide ex- 
ise analysis, optimal 
c allow one to con- 
1 of these aspects of 
nvolved in digital 
s in digital filtering 
•trning design specifi- 
erstanding that workers 
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in digital signal processing have concerning the problems of digital 
implementation is something that researchers in control and estimation 
would do well to gain. Thus it seems clear chat a mutual understamding 
will prove beneficial to all concerned. 

Numerous questions have been raised and speculation on various - 
possibilities has been made throughout this paper, t'fhether any of these 
issues has a useful emswer is a question for the future. It is my 
feeling that many of them do, and it is my hope chat others will think 


so as well 
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